CANADIAN
JOURNAL OF MATHEMATICS


Journal Canadien de Mathématiques 


VOL. XII - NO. 4
1960

On immersion of manifolds    Hans Samelson
The enumeration of point labelled chromatic graphs and trees    T. L. Austin
Some generalized theorems on connectivity    R. E. Nettleton
On orientations, connectivity and odd-vertex-pairings in finite graphs    C. St. J. A. Nash-Williams
On the projective centres of convex curves    Paul Kelly and E. G. Straus
A semimodular imbedding of lattices    D. T. Finkbeiner
Intersection irreducible ideals of a non-commutative principal ideal domain    Edmund H. Feller
On continuous regular rings and semisimple self injective rings    Yuzo Utumi
Automorphisms of finite linear groups    Robert Steinberg
Invariants of finite reflection groups    Robert Steinberg
A metrical theorem in diophantine approximation    Wolfgang Schmidt
A local property of measurable sets    W. Eames
On a class of non-self-adjoint differential operators    R. R. D. Kemp
The Gibbs phenomenon for Taylor means and for [F, d_n] means    Chester L. Miracle
Continuous translation of Hölder and Lipschitz functions    H. Mirkil
On Lie semi-groups    R. P. Langlands
The expression of trigonometrical series in Fourier form    George Cross
On a discriminant inequality    L. J. Mordell


Published for 
THE CANADIAN MATHEMATICAL CONGRESS 
by the 
University of Toronto Press 





EDITORIAL BOARD 


H. S. M. Coxeter, G. F. D. Duff, R. D. James, R. L. Jeffery, 
J.-M. Maranda, G. de B. Robinson, P. Scherk


with the co-operation of 


D. B. DeLury, J. Dixmier, W. Fenchel, H. Freudenthal, I. Kaplansky,
N. S. Mendelsohn, C. A. Rogers, H. Schwerdtfeger, A. W. Tucker,
W. J. Webber, M. Wyman


The chief languages of the Journal are English and French. 


Manuscripts for publication in the Journal should be sent to the 
Editor-in-Chief, G. F. D. Duff, University of Toronto. Authors are 
asked to write with a sense of perspective and as clearly as possible, 
especially in the introduction. Regarding typographical conventions, 
attention is drawn to the Author's Manual of which a copy will be 
furnished on request. 


All other correspondence should be addressed to the Managing 
Editor, G. de B. Robinson, University of Toronto. 


The Journal is published quarterly. Subscriptions should be sent 
to the Managing Editor. The price per volume of four numbers 
is $10.00. This is reduced to $5.00 for individual members of recognized 
Mathematical Societies. 


The Canadian Mathematical Congress gratefully acknowledges the 
assistance of the following towards the cost of publishing this Journal: 


University of Alberta Assumption University 
University of British Columbia Carleton College 
Dalhousie University Ecole Polytechnique 
Université Laval Loyola College 
University of Manitoba McGill University 
McMaster University Université de Montréal 
Mount Allison University Nova Scotia Technical College 
Queen’s University St. Mary’s University 
University of Saskatchewan University of Toronto 

National Research Council of Canada 

and the 
American Mathematical Society 


AUTHORIZED AS SECOND CLASS MAIL, POST OFFICE DEPARTMENT, OTTAWA 










ON IMMERSION OF MANIFOLDS 
HANS SAMELSON 


1. Introduction. In (3) R. Lashof and S. Smale proved among other
things the following theorem. If the compact oriented manifold M is immersed
into the oriented manifold M’, with dim M’ ≥ dim M + 2, then the normal
degree of the immersion is equal to the Euler-Poincaré characteristic χ of M
reduced modulo the characteristic χ’ of M’. If M’ is not compact, χ’ is replaced
by 0. "Manifold" always means C^∞-manifold. An immersion is a differentiable
(that is, C^∞) map f whose differential df is non-singular throughout. The normal
degree is defined in a certain fashion using the normal bundle of M in M’,
derived from f, and injecting it into the tangent bundle of M’.

It is our purpose to give an elementary proof, using vector fields, of this
theorem, and at the same time to identify the homology class that represents
the normal degree (Theorem I), and to give a proof, using the theory of
Morse, for the special case M’ = Euclidean space (Theorem II). The proof
of Theorem II consists of a slight addition to arguments due to Chern and
Lashof (1; 2). We introduce some notation. If x is a point of M, we write M_x
for the tangent space of M at x; if g is a (C^∞) map of M, we write dg for the
differential of g, that is, the induced map of the tangent vectors; and we write
g_* for the induced map of the homology group H_*(M) (homology is always
meant as singular homology with integral coefficients); these conventions
apply to all manifolds. Let T’ be the bundle of non-zero tangent vectors of
M’, and let S’ be the direction sphere of T’ at some point q ∈ M’, that is,
the unit sphere of M_q’, with respect to some Euclidean metric in M_q’. Let s’
denote the element of the homology group H_*(T’), represented by the basic
cycle of S’ with the given orientation of M’.

Given an auxiliary Riemannian metric in a neighbourhood of f(M) in M’,
the normal bundle B_ν of M under f consists of all pairs (x, v) where x ∈ M,
and v is a unit tangent vector of M’ at f(x), orthogonal to df(M_x). We write
ν for the map of B_ν into T’, given by (x, v) → v; this is the normal map or
Gauss map. The manifold B_ν receives a definite orientation from the
orientations of M and M’; let b_ν be the corresponding basic homology class; the
dimension of B_ν is equal to that of S’. The image of b_ν in H_*(T’) under the
homology map ν_* induced by ν is called the normal degree of f. Our result
then takes the following form.


THEOREM I. The normal degree of f is χ · s’ (in H_*(T’)).


Received September 1, 1959. The work on this note was supported by Air Force contract 
AF 49(638-104). 




It is well known that in case of compact M’ the element s’ is of order χ’
in H_*(T’); this is the theorem that the sum of the indices of a vector field
equals the characteristic. On the other hand, if M’ is not compact, then s’
generates an infinite cyclic group in H_*(T’). The theorem therefore allows us
to identify the normal degree with the integer χ reduced mod χ’ (resp. mod 0);
this is the result of Lashof and Smale. The proof of Theorem I appears in
sections 2 and 3 below.


2. A construction in vector bundles. We begin with the prototype of
immersion. Let M be as above a compact oriented C^∞ manifold, and let E
be an oriented C^∞ vector bundle over M, with fibre a vector space of some
(finite) dimension; we write p for the projection E → M. We introduce in E
an auxiliary Riemannian metric which on each fibre is translation invariant,
so that each fibre carries a Euclidean metric. Without loss of generality we
may assume that the 0-section of E, which we identify with M, is orthogonal
to all the fibres. The metric defines the unit ball bundle A and the unit sphere
bundle B: a point y of E belongs to A (resp. B) if the norm |y| of y, computed
in the fibre of y, is ≤ 1 (resp. = 1). A is a bounded manifold, with B as boundary,
with natural orientations induced from E. On B we consider a vector field N,
the "exterior normal", which assigns to each point y of B the tangent of the
curve (line) {ty: t ∈ reals} at y, that is, for t = 1. We propose to extend N
over all of A with a single singularity. To this end we choose in M (= 0-section
of E) a vector field F with a single singularity, that is, a continuous vector
field that vanishes at a single point x. We extend F to a vector field F_1 on A
by defining F_1(y) as the vector orthogonal to the fibre of y, that projects into
F(p(y)), multiplied by 1 − |y|; we note explicitly that F_1 vanishes on B. As
a second step we extend N to a vector field F_2 on A by using exactly the
definition of N: F_2(y) is the derivative vector (not the unit tangent vector) of
the curve {ty: t ∈ reals} for t = 1. Then F_2(y) is always tangent to the fibre
of y and vanishes for |y| = 0, that is, on M.

We now define a vector field G on A by G = F_1 + F_2, meaning
G(y) = F_1(y) + F_2(y) for y ∈ A. One verifies immediately from the properties
given above that G is an extension of N, and that G vanishes only at the point x.

The main point of our argument is now the following contention. 


(1) The indices of the vector fields F and G at their singular point x are equal. 


This is purely a local matter. We take a neighbourhood V of x in M,
homeomorphic to Euclidean space, with x corresponding to the origin. We
may assume that in terms of this Euclidean structure the directions of the
vectors of the field F are constant along rays from the origin; if necessary we
deform F a bit. From the local product structure of the bundle E and the
definition G = F_1 + F_2 one verifies then that the map of a sphere around x in
E into the unit sphere S of the tangent space E_x, derived from G and defining
the index of G at x, is homotopic to the join of the two corresponding maps for
F and F_2. The degree for maps of spheres behaves multiplicatively under
forming joins, and the index of F_2 is clearly equal to 1. This proves (1).

The index of F at x is well known to equal the characteristic χ of M. We
shall interpret all this in homology language:

Let T_A denote the restriction to A of the bundle of all non-zero tangent
vectors of E (we could use unit vectors instead). Let s denote the element of
H_*(T_A) represented by the basic cycle on the positively oriented unit sphere
S in E_x, and let b denote the basic homology class of B in the given orientation:
both s and b are of infinite order. The vector field N is a map of B into T_A,
in fact a section over B.

Then the statement about the index of G can be phrased as follows: 


(2) N_*(b) = χ · s (in H_*(T_A)).


To prove this we change our point of view of the field G. Instead of having it
vanish at x, we deform it so that it is radially constant near x. We construct
the bounded orientable manifold Ā obtained from A by replacing x by S,
with the topology so defined that a neighbourhood of a unit vector v at x
consists of v and all points of A − {x} near x in a cone around v (cf. the
construction of T in (3)). The boundary of Ā is B − S in the usual notation.
The vector field G defines a map of Ā into T_A, coinciding with N on B, and
mapping S into itself with degree equal to the index of G at x. This clearly
proves (2).

Next we note that there is a natural map I of B into T_A, mapping y into the
unit vector tangent to {ty} for t = 0; and the two maps I and N are homotopic
(the image vector sliding on the ray of y from the origin to y). From (2) we
get
(3) I_*(b) = χ · s in H_*(T_A).


Finally let T_M be the restriction of T_A to M; clearly T_M is a strong deformation
retract of T_A, by "radial contraction"; from I(B) ⊂ T_M and (3) we get


(4) I_*(b) = χ · s in H_*(T_M),


with the obvious meaning of s. 


3. Application to immersion. Suppose now M is immersed into the
manifold M’ by a map f, as in the introduction (dim M’ > dim M). In addition
to the notation and concepts already defined there we consider the normal
(vector) bundle E_ν, consisting of all pairs (x, v) with x ∈ M and v ∈ M’_{f(x)},
orthogonal to df(M_x); we use the metric of M’_{f(x)} for v. We apply to E_ν the
considerations of section 2, using a subscript ν where applicable (thus B_ν is the
normal unit sphere bundle, etc.).

Let h be the exponential map of E_ν into M’, constructed by means of the
Riemannian metric in M’; if the metric is defined only in a neighbourhood of
f(M), then h is defined in a neighbourhood of M in E_ν. The differential of h
maps T_M (the tangent vectors of E_ν at points of M) into T’ in a non-degenerate
fashion. This implies dh_*(s_ν) = s’. Further, the composition of the map I,
defined above, and dh is just the normal map ν. Applying dh_* to (4) we
obtain:


ν_*(b_ν) = χ · s’ in H_*(T’),


thus proving Theorem I.

It should be noted that Theorem I, as stated, holds also in the case
dim M’ = dim M + 1; but in this case the customary concept of normal
degree, in particular in the case M’ = Euclidean space, is somewhat different.
The reason is that B_ν here consists of two copies of M.


4. Immersion in Euclidean space. Suppose we have again the situation
of Theorem I, but that now M’ is a Euclidean space E^k, with a fixed orientation
and Euclidean metric. We keep the same notation, but make use of the usual
identification of E^k with its various tangent spaces. E_ν is the normal bundle,
the pairs (x, v) with x ∈ M and v ∈ E^k, orthogonal to the subspace df(M_x).
Requiring |v| = 1 we get B_ν. The map (x, v) → x is the projection p. The
map (x, v) → v, still called ν, is now regarded as a map of B_ν into the unit
sphere S^{k−1} of E^k. Since orientations are fixed on these two manifolds, the degree
of ν is well defined; this is again called the normal degree of f; we write n_f for
it now.


THEOREM II. n_f = χ.


Remark. In the case k = dim M + 1 (and odd k) the normal degree, as
defined here, is twice the usual normal degree, since B_ν consists of two copies
of M, one to either side of M; for even k both integers are 0; the contributions
of the two parts of B_ν to n_f cancel out.

For a detailed description of the facts used below see (1), particularly 
pp. 310-312 and (2, p. 8); all we add to the arguments given there is our 
relation (5). 

By Sard’s theorem there exists a vector v_0 ∈ S^{k−1} such that at each point
y ∈ B_ν with ν(y) = v_0 the differential of the map ν (which maps the tangent
space of B_ν at y into the tangent space of S^{k−1} at v_0) is non-degenerate; there
are only a finite number of such points y. Let the function φ on E^k be defined
by φ(v) = v · v_0 (inner product of v and v_0), and let ψ be the function φ∘f,
induced on M.

Then the following two statements hold: 


(A) The set D of points y of B_ν with ν(y) = v_0 and the set C of critical points
of ψ in M are in one-one correspondence under the projection p.


(B) All critical points of ψ are non-degenerate. If y = (x, v_0) is a point of D,
then the local degree d(y) of ν at y and the index j(x) of ψ at x are related by


(5) d(y) = (−1)^{n+j(x)}.






















To prove (A), we note that x is critical for ψ if df(M_x) is orthogonal to v_0,
that is, if among the points y ∈ p^{−1}(x) there is one with ν-image v_0. The proof
of (B) is subtler. Let y = (x, v_0) be a point of D. The determinant J(y) of
the differential of ν at y (with respect to the given orientations of B_ν and
S^{k−1}) can be interpreted as follows. The matter being local, we restrict to a
small neighbourhood V of x in M. Let L(v_0) be the subspace of E^k spanned by
df(M_x) and v_0, and oriented accordingly. We follow the map f of V into E^k by
the orthogonal projection into L(v_0), obtaining an immersed manifold V’,
with x’ corresponding to x. Then J(y) is the ordinary Gauss-Kronecker
curvature of the hypersurface V’ of the Euclidean space L(v_0) at x’; of course
v_0 is automatically the positive normal of V’ at x’. With respect to a suitable
co-ordinate system in L(v_0), with the origin at x’ and with v_0 as the last
co-ordinate vector, V’ is described by a function g, in the form


t_{n+1} = g(t_1, ..., t_n).


It is clear from the construction that g is essentially the function ψ; in detail,
if we write f’ for the map from V to V’, we have ψ(z) = g(t_1(f’(z)), ..., t_n(f’(z)))
for every z in V. Moreover, since df(M_x) is the tangent space to V’ at x’, the
Taylor expansion of g begins with the quadratic terms. The Gauss-Kronecker
curvature is (−1)^n times the determinant of the quadratic form; the
non-degeneracy implies that it is not zero. Let the quadratic form be diagonalized,
so that


g(t_1, ..., t_n) = −A_1 t_1^2 − ... − A_j t_j^2 + A_{j+1} t_{j+1}^2 + ... + A_n t_n^2 + ...

with all the A_i > 0. Then the curvature is (−1)^{n+j} A_1 ... A_n; this is then also
J(y). The degree d(y) of ν at y is the sign of J(y):

(6) d(y) = (−1)^{n+j}.

On the other hand, since g is just ψ, the above form of g shows that the critical
point x of ψ is non-degenerate and that its index is j:

(7) j(x) = j. 

Together, (6) and (7) prove (B). 

8. We apply the theory of Morse (4). The sum Σ (−1)^{j(x)}, extended over
all critical points of ψ, is the alternating sum of the type numbers, and therefore
equal to the characteristic χ of M. By (A) and (B) we have then χ = Σ (−1)^n
d(y) = (−1)^n Σ d(y), where the sum extends over all points y with ν(y) = v_0;
but by well-known principles the sum Σ d(y) is exactly the degree of the map
ν, that is, the normal degree n_f of f. We have then

χ = (−1)^n n_f.


This proves Theorem II, since for odd n one knows that χ = 0.
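
A quick consistency check on the signs, offered as an illustration rather than as part of the original argument. Assume M is the unit sphere S^2 in E^3 with its standard embedding (so n = 2, k = 3, χ = 2), and take v_0 to be the upward vertical unit vector; this choice of example is ours, not the author's. Then ψ is the height function, its critical points are the south pole (index 0) and the north pole (index 2), and the relations of (A) and (B) give:

```latex
\begin{align*}
\sum_{x \in C} (-1)^{j(x)} &= (-1)^{0} + (-1)^{2} = 2 = \chi(S^2), \\
d(y) &= (-1)^{n + j(x)} = +1 \quad \text{at both points } y \in D, \\
n_f &= \sum_{y \in D} d(y) = 2 = (-1)^{n}\,\chi .
\end{align*}
```

This agrees with the Remark above: for k = dim M + 1 and odd k, the normal degree as defined here is twice the usual normal degree, which is 1 for the round sphere.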













REFERENCES 


1. S. S. Chern and R. K. Lashof, On the total curvature of immersed manifolds, Amer. J. Math., 
79 (1957), 306-318. 

2. S. S. Chern and R. K. Lashof, On the total curvature of immersed manifolds, II, Mich. Math. J., 5 (1958), 5-12.

3. R. Lashof and S. Smale, On the immersion of manifolds in Euclidean space, Ann. Math., 68 
(1958), 562-583. 

4. M. Morse, Calculus of variations in the large, Amer. Math. Soc. Coll. Pub., 18 (1934). 


University of Michigan 























THE ENUMERATION OF POINT LABELLED 
CHROMATIC GRAPHS AND TREES 


T. L. AUSTIN 


1. Introduction. Given n points with c_1 of one colour, c_2 of another
colour, up to k colours, linear graphs are formed with the restriction that no
line connects points of the same colour. Following fairly standard terminology,
coloured graphs with this restriction will be called point chromatic graphs.
Giving the points numerical labels running from 1 to c_i for points of the ith
colour (i = 1, 2, ..., k) forms point labelled chromatic graphs. Note that
this description is slightly different from assigning a label and a colour inde- 
pendently to each point. If there are N graphs of the first kind and N* of the 
second the relationship is 


In this paper, relationships for enumerating these graphs are derived, along 
with simple expressions for the number of such trees (connected graphs with 
no cycles) and the number of such connected graphs of two colours with a 
single cycle. 


2. Generating functions. Let N(c_1, c_2, ..., c_k; l) be the number of
point labelled chromatic linear graphs with l lines, c_1 labelled points of one
colour, c_2 labelled points of another colour, etc., connected or in several parts,
and let C(c_1, c_2, ..., c_k; l) be the number of such graphs that are connected.
For convenience these will be written in vector notation, for instance,
N(c_1, c_2, ..., c_k; l) = N(C, l) with C = (c_1, c_2, ..., c_k).

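It is clear that

\[ N(C, l) = \binom{\tfrac12\left(n^2 - \sum_{i=1}^{k} c_i^2\right)}{l}, \qquad n = c_1 + c_2 + \cdots + c_k, \tag{1} \]

since the fully connected labelled chromatic graph, that is, the graph with a
line connecting each pair of differently coloured points, has c_i(n − c_i) lines at
points of the ith colour, hence has

\[ \tfrac12\big([n - c_1]c_1 + [n - c_2]c_2 + \cdots + [n - c_k]c_k\big)
 = \tfrac12\big(n[c_1 + c_2 + \cdots + c_k] - [c_1^2 + c_2^2 + \cdots + c_k^2]\big)
 = \tfrac12\Big(n^2 - \sum_{i=1}^{k} c_i^2\Big) \]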


Received June 3, 1959. 




lines in all. N(C, l) obviously arises by taking unordered selections of l lines from
this graph, hence (1) follows immediately.
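
As a sanity check, the count N(C, l) can be reproduced by brute force for a small case. The sketch below is illustrative only and not part of the original derivation: the helper names `count_chromatic_graphs` and `formula_N`, and the choice C = (1, 2, 2), are assumptions made for the example. It enumerates every set of lines on the labelled points, keeps the sets with no line between two points of the same colour, and compares the tally for each l with the binomial coefficient of ½(n² − Σ c_i²) over l.

```python
from itertools import combinations
from math import comb


def count_chromatic_graphs(colour_sizes):
    """Brute-force N(C, l): return {l: number of graphs with exactly l lines}."""
    # A point is a (colour index, label) pair; labels run 1..c_i within colour i.
    points = [(i, lab) for i, c in enumerate(colour_sizes) for lab in range(1, c + 1)]
    all_pairs = list(combinations(points, 2))
    counts = {}
    for size in range(len(all_pairs) + 1):
        for lines in combinations(all_pairs, size):
            # Keep only graphs in which no line joins two points of the same colour.
            if all(a[0] != b[0] for a, b in lines):
                counts[size] = counts.get(size, 0) + 1
    return counts


def formula_N(colour_sizes, l):
    """Binomial coefficient with upper entry (n^2 - sum of c_i^2) / 2."""
    n = sum(colour_sizes)
    max_lines = (n * n - sum(c * c for c in colour_sizes)) // 2
    return comb(max_lines, l)


if __name__ == "__main__":
    C = (1, 2, 2)  # hypothetical small example: n = 5 points, k = 3 colours
    brute = count_chromatic_graphs(C)
    for l in sorted(brute):
        print(l, brute[l], formula_N(C, l))
```

The enumeration is exponential in the number of point pairs, so it is only meant for very small C.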
Likewise

\[ N'(C, l) = \binom{\tfrac12\left(n^2 - \sum_{i=1}^{k} c_i^2\right) + l - 1}{l}, \tag{1a} \]

with the prime indicating that lines in parallel are allowed, follows in the
same way, since this is just the form that (1) takes if the selection is done
with replacement.

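Introduce the exponential generating functions of the numbers N(C, l) and
C(C, l):

\[ N(x_1, x_2, \ldots, x_k; z) = N(X, z)
 = \sum_{c_1=0}^{\infty} \sum_{c_2=0}^{\infty} \cdots \sum_{c_k=0}^{\infty}
   \sum_{l=0}^{\frac12[n^2 - \sum c_i^2]}
   \frac{x_1^{c_1} x_2^{c_2} \cdots x_k^{c_k}}{c_1!\, c_2! \cdots c_k!}\, z^l\, N(C, l) \tag{2} \]

\[ C(x_1, x_2, \ldots, x_k; z) = C(X, z)
 = \sum_{c_1=0}^{\infty} \sum_{c_2=0}^{\infty} \cdots \sum_{c_k=0}^{\infty}
   \sum_{l=0}^{\frac12[n^2 - \sum c_i^2]}
   \frac{x_1^{c_1} x_2^{c_2} \cdots x_k^{c_k}}{c_1!\, c_2! \cdots c_k!}\, z^l\, C(C, l) \tag{3} \]

with N(0, l) = C(0, l) = 0 by convention. Then the relation between the
known N and C is given by

THEOREM I. C(X, z) = log[N(X, z) + 1].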


The proof follows from a slight modification of an argument by Gilbert
(1). Thus represent by N_m(C, l) the number of graphs which are in m parts.
A graph enumerated by N_m(C, l) evidently consists of m distinct connected
graphs.

Thus, any point is in a connected graph with i_1 points of the first colour,
i_2 points of the second colour, etc., and with j lines, and the other (c_1 − i_1),
(c_2 − i_2), ..., (c_k − i_k) points and l − j lines are in a graph consisting of
m − 1 parts. The points can be labelled in

\[ \binom{c_1}{i_1}\binom{c_2}{i_2}\cdots\binom{c_k}{i_k} \]

ways and therefore


= Do Clin in... 5 ins Z)Nm—1(C1 — in, C2 — in). Ce — tel — 9) 
J 


aa «()G)---@)- 


Next, choosing one of the points in the graph in m — 1 parts and repeating 













the argument, it follows that (with superscripts to distinguish the connected 
graphs) 


(5) Na(C, 2) 


1 ad *(1) (i 1) 2) (2 (2) 
=< 7 C(ij PPT wt a yC(ay”, q”,..., f°: 7) 


yO), 702), fd, sa) 





r (1) (2) (1) (2) -(1) (2 
X Nm-2(er — i” — 82°, 6.6 Ge — ig? — 431 — Gf — 7) 
x ¢}! Cc! 
i —- IK 760) Ty) 
!(c¢y — 4,’ — in”)! hh la !(qQ—i% — i’)! 


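with the factor 1/2 entering to account for order between the two distinct
connected parts. Continuing this process, one obtains

\[ N_m(C, l) = \frac{1}{m!} \sum
   C(i_1^{(1)}, \ldots, i_k^{(1)}; j^{(1)})\,
   C(i_1^{(2)}, \ldots, i_k^{(2)}; j^{(2)}) \cdots
   C(i_1^{(m)}, \ldots, i_k^{(m)}; j^{(m)})
   \times \frac{c_1!\, c_2! \cdots c_k!}
   {i_1^{(1)}!\, i_1^{(2)}! \cdots i_1^{(m)}! \;\cdots\; i_k^{(1)}!\, i_k^{(2)}! \cdots i_k^{(m)}!}. \tag{6} \]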





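Multiplying both sides of (6) by

\[ \frac{x_1^{c_1} x_2^{c_2} \cdots x_k^{c_k}}{c_1!\, c_2! \cdots c_k!}\, z^l \]

and summing in turn over c_1, c_2, ..., c_k and l gives

\[ \sum_{c_1=0}^{\infty} \sum_{c_2=0}^{\infty} \cdots \sum_{c_k=0}^{\infty}
   \sum_{l=0}^{\frac12[n^2 - \sum c_i^2]}
   \frac{x_1^{c_1} \cdots x_k^{c_k}}{c_1! \cdots c_k!}\, z^l\, N_m(C, l)
 = N_m(x_1, x_2, \ldots, x_k; z) = N_m(X; z) = \frac{1}{m!}\, C^m(X, z). \tag{7} \]

Here N_0 = C^0 = 0. Then since

\[ N(X; z) = \sum_{m} N_m(X, z) \]

and by (7), the theorem follows immediately.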


3. Trees. A tree is a connected linear graph with no cycles (closed path of
points). Obviously this is the special case C(C, n − 1).


THEOREM II. The number of point labelled chromatic trees, C(C, n − 1), on
n points with c_1 labelled points of one colour, c_2 labelled points of another colour,
with k colours in all, is

C(C, n − 1) = n^{k−2} (n − c_1)^{c_1−1} (n − c_2)^{c_2−1} ... (n − c_k)^{c_k−1}.


Form the fully connected point labelled point chromatic graph on
n = c_1 + c_2 + ... + c_k points and consider the matrix A, with

a_ii = (the number of lines at point i), i = 1, 2, ..., n,
a_ij = −(the number of lines between points i and j), i ≠ j.













Since all points are distinguishable either by label or colour, there is no
ambiguity in the above definition and in fact

a_ii = (n − c_i’),
a_ij = −1 if points i and j are of different colour,
a_ij = 0 otherwise,

where c_i’ is the number of points of the same colour as point i.
For concreteness, this matrix with c_1 = 1, c_2 = 2, c_3 = 3, n = 6 is

 5 −1 −1 −1 −1 −1
−1  4  0 −1 −1 −1
−1  0  4 −1 −1 −1
−1 −1 −1  3  0  0
−1 −1 −1  0  3  0
−1 −1 −1  0  0  3

Since all rows and columns of A sum to zero, all (n − 1) × (n − 1) cofactors
are equal. Trent (4) has shown that this cofactor is the number of
trees in a fully connected graph which has the incidence matrix B, with

b_ii = 0,
b_ij = −a_ij, i ≠ j.

Thus to enumerate trees, it is necessary to evaluate a determinant of the
form

\[ \operatorname{Det} \Gamma =
\begin{vmatrix}
\gamma_1 I(d_1) & -V_{12} & -V_{13} & \cdots & -V_{1k} \\
-V_{21} & \gamma_2 I(d_2) & & & \vdots \\
\vdots & & \ddots & & \\
-V_{k1} & \cdots & & & \gamma_k I(d_k)
\end{vmatrix} \]

with V_ij (d_i × d_j) = {v_ij}; v_ij = 1 all i, j, the γ's scalars, and I(d_i) the identity
matrix of dimension d_i.

Since Γ is symmetric, it is necessary only to consider its latent roots to
evaluate its determinant. Hence consider the system of equations

\[ \Gamma \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_k \end{pmatrix}
 = \lambda \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_k \end{pmatrix} \tag{8} \]

with Y_i = Y_i(d_i × 1) = (y_{i1}, y_{i2}, ..., y_{i d_i})’.
If λ is different from each of γ_1, γ_2, ..., γ_k, then a consideration of the
equations two at a time shows that each vector Y_i must be a constant vector.
Thus (8) can be written as













\[ \sum_{j=1}^{k} \left[(\gamma_i + d_i)\,\delta_{ij} - d_j\right] Y_j = \lambda Y_i \qquad (i = 1, \ldots, k) \tag{9} \]

which contains exactly k independent equations. This accounts for k latent
roots, each different from γ_1, γ_2, ..., γ_k, and the product of these k roots is
the determinant D = |(γ_i + d_i)δ_ij − d_j|.
Next if λ = γ_1 (say), then clearly Y_2 = Y_3 = ... = Y_k = 0, Σ_j y_{1j} = 0,
the root γ_1 occurs d_1 − 1 times, and similarly for λ = γ_2, λ = γ_3, etc.
Therefore it follows that

\[ \operatorname{Det} \Gamma = D\, \gamma_1^{d_1 - 1} \gamma_2^{d_2 - 1} \cdots \gamma_k^{d_k - 1}. \tag{10} \]


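Substituting in (10) the values for an (n − 1) × (n − 1) cofactor of A,

γ_1 = (n − c_1), γ_2 = (n − c_2), ..., γ_k = (n − c_k);
d_1 = c_1 − 1, d_2 = c_2, d_3 = c_3, ..., d_k = c_k,

(10) becomes

\[ (n - c_1)^{c_1 - 2} (n - c_2)^{c_2 - 1} \cdots (n - c_k)^{c_k - 1}
\begin{vmatrix}
(n - c_1) & -c_2 & -c_3 & \cdots & -c_k \\
(1 - c_1) & (n - c_2) & -c_3 & \cdots & -c_k \\
\vdots & & & & \vdots \\
(1 - c_1) & -c_2 & -c_3 & \cdots & (n - c_k)
\end{vmatrix}. \tag{11} \]

The determinant can be evaluated simply by writing it as

\[ \left|
\begin{pmatrix}
-c_1 & -c_2 & -c_3 & \cdots & -c_k \\
1 - c_1 & -c_2 & -c_3 & \cdots & -c_k \\
\vdots & & & & \vdots \\
1 - c_1 & -c_2 & -c_3 & \cdots & -c_k
\end{pmatrix}
+
\begin{pmatrix}
n & & \\
 & \ddots & \\
 & & n
\end{pmatrix}
\right| = \operatorname{Det} \Delta \tag{12} \]

which can be expanded about the terms of the diagonal matrix, that is,

\[ \operatorname{Det} \Delta = n^k - n^{k-1}(c_1 + c_2 + \cdots + c_k)
 + n^{k-2}\left[c_1 c_2 + c_2(1 - c_1) + c_1 c_3 + c_3(1 - c_1) + \cdots + c_1 c_k + c_k(1 - c_1)\right] \tag{13} \]

which are all the terms since the first matrix is of rank two. Using the relation
c_1 + c_2 + ... + c_k = n, (13) becomes

\[ \operatorname{Det} \Delta = n^{k-2}\left[c_2 + c_3 + \cdots + c_k\right] = n^{k-2}(n - c_1). \tag{14} \]

Putting (11) and (14) together the theorem follows.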
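
The cofactor argument can be checked numerically on the worked case c_1 = 1, c_2 = 2, c_3 = 3. The following is a minimal sketch, not the author's procedure: the function names are hypothetical, and exact rational Gaussian elimination is used here simply as a convenient way to evaluate the (n − 1) × (n − 1) cofactor. It builds the matrix A described above, deletes the first row and column, and compares the determinant with n^{k−2} (n − c_1)^{c_1−1} ... (n − c_k)^{c_k−1}.

```python
from fractions import Fraction
from math import prod


def laplacian(colour_sizes):
    """Matrix A: a_ii = n - c_i', a_ij = -1 for different colours, 0 otherwise."""
    colours = [i for i, c in enumerate(colour_sizes) for _ in range(c)]
    n = len(colours)
    return [[(n - colour_sizes[colours[r]]) if r == s
             else (-1 if colours[r] != colours[s] else 0)
             for s in range(n)] for r in range(n)]


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in matrix]
    size = len(m)
    det = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if m[r][col] != 0), None)
        if pivot is None:
            return 0  # a zero column below the diagonal: singular matrix
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size):
                m[r][c] -= factor * m[col][c]
    return int(det)


def tree_count_by_cofactor(colour_sizes):
    a = laplacian(colour_sizes)
    minor = [row[1:] for row in a[1:]]  # delete the first row and column
    return determinant(minor)


def tree_count_by_formula(colour_sizes):
    n, k = sum(colour_sizes), len(colour_sizes)
    value = Fraction(n) ** (k - 2) * prod(Fraction(n - c) ** (c - 1) for c in colour_sizes)
    return int(value)


if __name__ == "__main__":
    C = (1, 2, 3)
    print(tree_count_by_cofactor(C), tree_count_by_formula(C))
```

For C = (1, 2, 3) both routes give 6 · 5^0 · 4^1 · 3^2 = 216.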


4. Connected graphs with one cycle. After trees the next graph in
order of complexity is a graph with one cycle. This section derives the number
of point labelled chromatic graphs with two colours and one cycle. For
convenience, the notation of this section is different from that of the previous
sections. Note that lines in parallel (cycles of length 2) are allowed.
















Before stating the main result of this section it will be necessary to establish 
a few preliminaries. 

A rooted tree is a tree in which one point, the root, has been preferred. In a
labelled tree on n points, any point may be preferred and all points are
distinguishable, hence there are n rooted trees for each root free tree.

Now let R_pq be the number of rooted chromatic trees, with p labelled black
points and q labelled white points, and let T_pq be the number of such root
free trees. Then since all points are distinguishable and since each point may
be preferred in turn as the root, it follows that

\[ R_{pq} = (p + q)\, T_{pq} = p\, T_{pq} + q\, T_{pq} = r_{pq} + r'_{pq} \tag{15} \]

with r_pq the number of trees with a black root and r_pq’ the number with a
white root.


Introduce the counting series,

\[ R(x, y) = \sum_{p} \sum_{q} \frac{x^p y^q}{p!\, q!}\, R_{pq} \tag{16} \]

\[ T(x, y) = \sum_{p} \sum_{q} \frac{x^p y^q}{p!\, q!}\, T_{pq} \tag{17} \]

\[ r(x, y) = \sum_{p} \sum_{q} \frac{x^p y^q}{p!\, q!}\, r_{pq} \tag{18} \]

\[ r'(x, y) = \sum_{p} \sum_{q} \frac{x^p y^q}{p!\, q!}\, r'_{pq} \tag{19} \]

and let r_pq(m) be the number of trees with a black root and with exactly m
lines at the root. Then a slight modification of an argument by Polya (2)
also given in Riordan (3, page 127) yields a relationship between r(x, y) and
r’(x, y). Thus

\[ r_{pq}(1) = p\, r'_{p-1,\,q} \tag{20} \]

because there are p labels for the root and the single line connects the root
to a white point which may be regarded as the root of a tree with p − 1
black and q white points.

Similarly, 


‘ 1S & (e-1\(a)\, 
(21) t(2)= SP De » F - | Tp—1—1, 3-9" 8) 


i=l j=l ] 
1 & —1\,, me 
=20E (°F "ier t 


symbolically, with [r;,]’ = ry and ro’ = ro, = 0, because the labels for the 








two (white rooted) trees added to the two lines at the (black) root may be
chosen in

\[ \binom{p-1}{i}\binom{q}{j} \]

ways when one of the trees has i black and j white points. Because of the
labels, the added trees are not alike. The factor (1/2!) accounts for order
between the two lines. In general, it is clear that


—-1 , , , q 
(22) Tpq(%) = £> .. ie i ) [r44 + T ia» + **-* + Tp—1-11— ° in—1} 


since the number of ways of assigning labels to trees with i_1, i_2, ... black and
j_1, j_2, ... white points is

\[ \frac{(p-1)!}{i_1!\, i_2! \cdots} \cdot \frac{q!}{j_1!\, j_2! \cdots}. \]
a ee Cs ees 


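From r_10 = 1 and since obviously

\[ r_{pq} = \sum_{m} r_{pq}(m), \]

it follows from (22) at once that

(23) r(x, y) = x exp r’(x, y)

and by an exactly similar argument,

(24) r’(x, y) = y exp r(x, y).

From (23) and (24),

(25) r(x, y) = x exp [y exp r(x, y)]

results. Recalling the Lagrange formula,

\[ t = x\,\varphi(t), \qquad
   f(t) = f(0) + \sum_{n=1}^{\infty} \frac{x^n}{n!}
   \left[ \frac{d^{\,n-1}}{dt^{\,n-1}} \big( f'(t)\, \varphi^n(t) \big) \right]_{t=0}, \]

and putting f(t) = t, t = r(x, y),

φ(t) = φ(r[x, y]) = exp [y exp r(x, y)],

it follows that

\[ r(x, y) = \sum_{p} \sum_{q} \frac{x^p y^q}{p!\, q!}\, p^{q} q^{p-1} \tag{26} \]

and similarly,

\[ r'(x, y) = \sum_{p} \sum_{q} \frac{x^p y^q}{p!\, q!}\, q^{p} p^{q-1}. \tag{27} \]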












From (26), (27), and (15),

\[ T(x, y) = \sum_{p} \sum_{q} \frac{x^p y^q}{p!\, q!}\, p^{q-1} q^{p-1} \tag{28} \]

follows. This is the result (for the case of two colours) of Theorem II.
The result may now be stated.


THEOREM III. The number of connected point labelled chromatic graphs with
p labelled black points and q labelled white points with one cycle of length 2m is

½ (p + q − m) (p)_m (q)_m p^{q−m−1} q^{p−m−1} for m > 1

and

(p + q − 1) p^{q−1} q^{p−1} for m = 1.

To enumerate (two coloured) labelled chromatic graphs with one cycle,
note that the cycle must be even, with the same number of white as black
points because of the chromatic condition. The essential enumeration is
therefore of such graphs with a single cycle of length 2m. These graphs may be
enumerated by a theorem due to Polya (2; 3), since they may be regarded as
being formed by placing white rooted trees (say) at the vertices of an m-sided
polygon and then replacing the lines of the polygon by roots of black rooted
trees to form a polygon of 2m vertices. Every permutation of labels for the
black roots clearly gives rise to a different polygon of 2m points while, on the
other hand, any permutation of labels for the white roots which is a reflection
or rotation of the original m-sided polygon is equivalent to some permutation
of the black labels. Thus the different cycles formed are the result of a direct
product of two groups, the identity group and the dihedral group.

Letting

\[ d_{2m}(u_1, u_2, v_1, v_2) = \sum_{\alpha, \beta, \gamma, \delta}
   u_1^{\alpha} u_2^{\beta}\, \frac{v_1^{\gamma}}{\gamma!}\, \frac{v_2^{\delta}}{\delta!}\,
   d_{\alpha\beta\gamma\delta}(2m) \]

be the enumerator of such graphs by number of points and point labels, then
by the theorem and from a simple extension of a problem in Riordan (3,
158),

\[ d_{2m}(u_1, u_2, v_1, v_2) = I_m\big(r[u_1, u_2, v_1, v_2]\big)\,
   D_m\big(r'[u_1, u_2, v_1, v_2],\, r'[u_1^2, u_2^2],\, \ldots,\, r'[u_1^m, u_2^m]\big)
   \quad \text{for } m > 2; \tag{29} \]

\[ d_{2m}(u_1, u_2, v_1, v_2) = S_m\big(r[u_1, u_2, v_1, v_2],\, r[u_1^2, u_2^2],\, \ldots,\, r[u_1^m, u_2^m]\big)\,
   S_m\big(r'[u_1, u_2, v_1, v_2],\, r'[u_1^2, u_2^2],\, \ldots,\, r'[u_1^m, u_2^m]\big)
   \quad \text{for } m = 1, 2, \]

with r(u_1, u_2, v_1, v_2) and r’(u_1, u_2, v_1, v_2) the enumerators of (black and white)
rooted trees by number of points and number of point labels,

r(u_1, u_2) = r(u_1, u_2, 0, 0),







I_m(t) = t^m, the cycle index of the identity group and D_m(t_1, t_2, ..., t_m) the
cycle index of the dihedral group,

\[ D_m(t_1, t_2, \ldots, t_m) = \frac{1}{2m} \sum_{i \mid m} \varphi(i)\, t_i^{m/i}
   + \frac12\, t_1 t_2^{(m-1)/2} \quad \text{for } m \text{ odd}; \]

\[ D_m(t_1, t_2, \ldots, t_m) = \frac{1}{2m} \sum_{i \mid m} \varphi(i)\, t_i^{m/i}
   + \frac14\, t_2^{m/2 - 1}\,(t_1^2 + t_2) \quad \text{for } m \text{ even}. \]

Here φ(i) is the Euler totient function, the number of integers not greater
than i and relatively prime to i,

\[ S_m(t_1, t_2, \ldots, t_m) = \frac{1}{m!} \sum
   \frac{m!}{1^{j_1} j_1!\; 2^{j_2} j_2! \cdots m^{j_m} j_m!}\;
   t_1^{j_1} t_2^{j_2} \cdots t_m^{j_m} \]

is the cycle index of the symmetric group, the sum extending over all solutions
of

j_1 + 2 j_2 + ... + m j_m = m.


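The substitutions x = u_1 v_1, y = u_2 v_2 in the definition of d_{2m}(u_1, u_2, v_1, v_2)
change it to the form

\[ d_{2m}(u_1, u_2, x, y) = \sum_{\alpha, \beta, \gamma, \delta}
   u_1^{\alpha - \gamma} u_2^{\beta - \delta}\, \frac{x^{\gamma}}{\gamma!}\, \frac{y^{\delta}}{\delta!}\,
   d_{\alpha\beta\gamma\delta}(2m). \]

Therefore, the numbers required, d_{γδγδ}(2m), are enumerated by

\[ d_{2m,0}(x, y) = \sum_{\gamma, \delta} d_{\gamma\delta\gamma\delta}(2m)\, \frac{x^{\gamma} y^{\delta}}{\gamma!\, \delta!} \]

which from (29) is

\[ d_{2m,0}(x, y) = \frac{1}{2m}\, r^m(x, y)\, r'^m(x, y) \quad \text{for } m > 1, \]
\[ d_{2m,0}(x, y) = r(x, y)\, r'(x, y) \quad \text{for } m = 1. \]

Recalling

(25) r(x, y) = x exp [y exp r(x, y)],

and using the Lagrange formula again with f(t) = t^m, one obtains

\[ r^m(x, y) = \sum_{p} \sum_{q} \frac{x^p y^q}{p!\, q!}\, (p)_m\, m\, p^{q-1} q^{p-m}. \tag{30} \]

Also from (24)

\[ r^m(x, y)\, r'^m(x, y) = r^m(x, y)\, y^m \exp[m\, r(x, y)] \]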


ll 
ba 
3 
_ 
ba 
te 
Xe 
3 
Ms 
13 
IS 
te 
—is 
| 
Ne 


ll 
2 
3 
a 
3 
aj 
Ay 
2 
on 
3 


r™ (x, y)r"™" (x, ¥) 


and by (30) 










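$[r(x, y)\, r'(x, y)]^m = \sum_p \sum_q \frac{x^p y^{q+m}}{p!\, q!}\, p^{q-1} \sum_{n=0}^{p-m} \frac{m^n}{n!}\, (p)_{m+n}\, (m+n)\, q^{p-m-n}$
$= \sum_p \sum_q \frac{x^p y^{q+m}}{p!\, q!}\, p^{q-1} (p)_m \sum_{n=0}^{p-m} \binom{p-m}{n} (n+m)\, q^{p-m-n} m^n.$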


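Writing the factor $n + m$ as the sum of $n$ and $m$, we have


$\sum_{n} \binom{p-m}{n} (n+m)\, q^{p-m-n} m^n = \sum_{n} \binom{p-m}{n}\, n\, q^{p-m-n} m^n + m \sum_{n} \binom{p-m}{n}\, q^{p-m-n} m^n$
$= (p-m)\, m\, (q+m)^{p-m-1} + m\, (q+m)^{p-m}$
$= m\, (q+m)^{p-m-1} (q+p)$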


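and therefore


(31) $[r(x, y)\, r'(x, y)]^m = \sum_p \sum_q \frac{x^p y^{q+m}}{p!\, q!}\, p^{q-1} (p)_m\, m\, (q+m)^{p-m-1} (p+q)$
$= \sum_p \sum_q \frac{x^p y^q}{p!\, q!}\, m\, (p+q-m)\, (p)_m (q)_m\, p^{q-m-1} q^{p-m-1}.$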


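From (31) it follows at once that


(32) $d_{2m,0}(x, y) = \frac{1}{2} \sum_p \sum_q \frac{x^p y^q}{p!\, q!}\, (p+q-m)\, (p)_m (q)_m\, p^{q-m-1} q^{p-m-1}$ for $m > 1$
$= \sum_p \sum_q \frac{x^p y^q}{p!\, q!}\, (p+q-1)\, p^{q-1} q^{p-1}$ for $m = 1$.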


5. Remarks. Consider a fully point labelled but otherwise unrestricted
tree on $n$ points (with no chromatic conditions). Construct from it a point
chromatic tree as follows. Select any point and give to it one of $k$ colours.
Give each point which shares a line with the first point one of the other
$k - 1$ colours arbitrarily, and continue until all points of the tree are coloured.
No two points which share a line can have the same colour from this procedure
and so the result is a point labelled point chromatic tree with the
numerical labels running from 1 to $n$. On the first step, there are $k$ choices of
colours and $k - 1$ choices on each of the succeeding steps, hence a total of
$k(k-1)^{n-1}$ choices in all. From a result of Cayley's, it is known that the
number of point labelled trees is $n^{n-2}$. Therefore the total number of
distinguishable point chromatic point labelled trees that can be formed from the
above procedure is $k(k-1)^{n-1} n^{n-2}$.
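
The count just described can be checked by brute force for small cases. The following is only an illustrative sketch: it assumes points are labelled 0 to n-1, generates every labelled tree from its Pruefer sequence (a standard device not used in the text), and counts the colourings in which no line joins two points of the same colour. The function names are hypothetical.

```python
from itertools import product


def prufer_to_edges(seq, n):
    """Decode a Pruefer sequence into the lines of a labelled tree on 0..n-1."""
    degree = [1] * n
    for v in seq:
        degree[v] += 1
    edges = []
    for v in seq:
        # Attach the smallest remaining leaf to the next entry of the sequence.
        leaf = min(u for u in range(n) if degree[u] == 1)
        edges.append((leaf, v))
        degree[leaf] -= 1
        degree[v] -= 1
    u, w = [x for x in range(n) if degree[x] == 1]
    edges.append((u, w))
    return edges


def count_coloured_trees(n, k):
    """Count pairs (labelled tree on n points, proper colouring with k colours)."""
    total = 0
    for seq in product(range(n), repeat=n - 2):
        edges = prufer_to_edges(seq, n)
        for colours in product(range(k), repeat=n):
            if all(colours[a] != colours[b] for a, b in edges):
                total += 1
    return total


if __name__ == "__main__":
    for n, k in [(3, 2), (4, 2), (4, 3)]:
        print(n, k, count_coloured_trees(n, k), k * (k - 1) ** (n - 1) * n ** (n - 2))
```

For each pair printed, the brute-force count and the closed form should agree.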







Next if the trees enumerated by $C(C, n-1)$ of Theorem II are re-labelled
by running the numerical labels from 1 to $n$ and assigning them to the points
independently of the point colours, there are


$\frac{n!}{c_1!\, c_2! \cdots c_k!}$

such assignments and so it follows at once that

(33) $\sum \frac{n!}{c_1!\, c_2! \cdots c_k!}\, C(C, n-1) = k(k-1)^{n-1} n^{n-2}$


with the sum running over all partitions of $n$ into $k$ parts. Equation (33)
provides a useful check on the enumeration and leads to the interesting identity


(34) $\sum \frac{n!}{c_1!\, c_2! \cdots c_k!}\, (n-c_1)^{c_1-1} (n-c_2)^{c_2-1} \cdots (n-c_k)^{c_k-1} = k(k-1)^{n-1} n^{n-k}$


or, for the more manageable case $k = 2$,


(34a) $\sum_{i=1}^{n-1} \binom{n}{i}\, i^{n-i-1} (n-i)^{i-1} = 2n^{n-2}.$
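
The two-colour identity (34a) is easy to test numerically. The sketch below assumes the reading in which the sum of C(n, i) i^(n-i-1) (n-i)^(i-1) over i = 1, ..., n-1 equals 2n^(n-2); the function name is hypothetical.

```python
from math import comb


def lhs_34a(n):
    """Left-hand side of (34a): sum over i of C(n, i) * i^(n-i-1) * (n-i)^(i-1)."""
    return sum(comb(n, i) * i ** (n - i - 1) * (n - i) ** (i - 1)
               for i in range(1, n))


if __name__ == "__main__":
    for n in range(2, 10):
        print(n, lhs_34a(n), 2 * n ** (n - 2))
```

Each term counts the labelled trees whose two colour classes have i and n - i points, so the agreement is the k = 2 case of the check provided by (33).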


REFERENCES 


1. E. N. Gilbert, Enumeration of labelled graphs, Can. J. Math., 8 (1956), 405-411. 

2. G. Pólya, Kombinatorische Anzahlbestimmungen für Gruppen, Graphen, und chemische Verbindungen,
Acta Math., 68 (1937).

3. John Riordan, An introduction to combinatorial analysis (New York, 1958). 

4. H. M. Trent, A note on the enumeration and listing of all possible trees in a connected linear 
graph, Proc. Nat. Acad. Sci., 40 (1954), 1004-1007. 


Technical Operations, Inc. 
Fort Monroe, Virginia 











SOME GENERALIZED THEOREMS ON 
CONNECTIVITY 


R. E. NETTLETON 


The "k-dense" subgraphs of a connected graph G are connected and contain
neighbours of all but at most k - 1 points. We consider necessary and sufficient
conditions that a point be in $\Gamma_k$, the union of the minimal k-dense subgraphs.
It is shown that $\Gamma_k$ contains all the "[m, k]-isthmuses" and "[m, k]-articulators"
(minimal subgraphs which disconnect the graph into at least k + 1
disjoint graphs) and that an [m, k]-isthmus or [m, k]-articulator of $\Gamma_k$
disconnects G. We define "central points," "degree" of a point, and "chromatic
number" and examine the relationship of these concepts to connectivity.
Many theorems contain theorems previously proved (1) as special cases.


1. Definitions. The concepts points, graph, and subgraph will be used
here in precisely the same sense as in a previous paper (1), in which were also
defined the union, intersection, and difference of two subgraphs, together with
neighbours, path of length k, diameter of a graph, connected points and graphs,
m-connected and completely connected graphs, articulator, a subgraph which
disconnects G, and the partition of a disconnected graph. Unless otherwise
specified, a connected graph G will have a finite number "n" of points, and the
null graph will be assumed disconnected. If $G'$ and $G''$ are subgraphs, $G'(G'')$
will denote the subgraph determined by all points in $G' - G''$ which have
neighbours in $G''$. If $G''$ is a single point p, we denote this subgraph by $G'(p)$. The
number of points in G(p) is the degree of p, and the set of all points in G which
have a given degree forms a degree class.

The distance between a point $p_1$ and a subgraph S contained in $G - p_1$ is the
smallest positive integer q such that for some point $p_2$ in S there is a path of
length q connecting $p_1$ and $p_2$. A subgraph $G'$ disconnects two subgraphs which
are contained in separate graphs of the partition of $G - G'$. An [m, k]-isthmus
([m, k]-articulator) is a completely connected (not completely connected)
subgraph $G'$ which disconnects G and has precisely m points, such that $G'$
contains no proper disconnecting subgraph, and the partition of $G - G'$ consists
of at least k + 1 graphs. The generic term isthmus will refer to a subgraph
which, for any m, is an [m, 1]-isthmus. $Q_k$ will denote the union of all subgraphs
which, for any m, are [m, k]-articulators or [m, k]-isthmuses of G. In Figure 1,
the points {2, 5} determine a [2, 1]-isthmus, while {2, 4} determine a [2, 2]-articulator.


Received March 31, 1959. Research supported by the Robert A. Welch Foundation. 




















SOME THEOREMS ON CONNECTIVITY 547 














FIGURE 1


2. k-dense subgraphs. If k is any positive integer, a k-dense subgraph
$G'$ is a connected subgraph such that there are at most k - 1 points of $G - G'$
which have no neighbours in $G'$. In Figure 2, the points "1" and "4" are
1-dense while "2" and "3" are 2-dense.
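
The definition translates directly into a test. The sketch below is illustrative only: it assumes a graph stored as a dictionary mapping each point to the set of its neighbours, and since the graph of Figure 2 is not reproduced here, the example uses a hypothetical path on four points. The function names are not from the paper.

```python
def is_connected(adj, nodes):
    """True if the subgraph determined by `nodes` is connected (null graph: False)."""
    nodes = set(nodes)
    if not nodes:
        return False  # the null graph is assumed disconnected
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        v = stack.pop()
        for w in adj[v]:
            if w in nodes and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == nodes


def is_k_dense(adj, sub, k):
    """True if `sub` is connected and at most k-1 outside points have no neighbour in it."""
    sub = set(sub)
    if not is_connected(adj, sub):
        return False
    missed = [v for v in adj if v not in sub and not (adj[v] & sub)]
    return len(missed) <= k - 1


if __name__ == "__main__":
    # Hypothetical example: the path 0 - 1 - 2 - 3.
    path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
    print(is_k_dense(path, {1, 2}, 1))  # both outside points have neighbours inside
    print(is_k_dense(path, {1}, 1), is_k_dense(path, {1}, 2))
```

On the path, the middle pair is 1-dense, while the single point 1 misses point 3 and so is 2-dense but not 1-dense.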














FIGURE 2


Clearly if $G'$ is k-dense, then $G'$ is p-dense for p > k. Another obvious result is


LEMMA 2.1. A connected subgraph which contains a k-dense subgraph is
k-dense.


Let $S_k$ denote the set of all k-dense subgraphs having at least k points, plus
the null graph. A trivial consequence of 2.1 is


THEOREM 2.2. The subgraphs in $S_k$ form a lattice under the relation of set
inclusion.


If G is m-connected for m < n and contains a point $p_1$, $G(p_1)$ must have at
least m points, which implies

THEOREM 2.3. If G is m-connected for m < n, every connected subgraph
is (n - m)-dense.

A k-dense subgraph which properly contains no other k-dense subgraph is
said to be $D_k$-minimal. The reader will easily verify:

THEOREM 2.4. If a point is $D_k$-minimal, then it is contained in every [m, k]-isthmus
and [m, k]-articulator.


If n > 1 and a point p is not a [1, 1]-isthmus, $G - p$ is 1-dense and hence
k-dense. If p is a [1, 1]-isthmus and the partition of $G - p$ contains precisely
q + 1 graphs, q of which contain fewer than k points among them, then the
remaining graph in the partition is k-dense, which cannot be the case if every
q graphs in the partition contain at least k points. This proves:


THEOREM 2.5. For n > 1, the intersection of all the k-dense subgraphs is
precisely the subgraph $G'$ of [1, 1]-isthmuses with the property that for any [1, q]-isthmus
$G''$ in $G'$ such that $G''$ is not a [1, q + 1]-isthmus, any set of q graphs in
the partition of $G - G''$ contain at least k points among them.


We shall now show how the assumption that G is k-connected or that a
point is $D_k$-minimal restricts the distances between points of G. The associated
number of a point $p_1$ is the greatest positive integer d such that for some point
$p_2$ in G, d is the distance between $p_1$ and $p_2$. A point having minimum associated
number is called a central point of G.


LEMMA 2.6. Suppose G is k-connected. Let p be a point in G with associated
number d, and define $G^r(p)$, r = 1, ..., d, to be the set of all points distant
r from p, while $G^0(p) = p$. It follows that:¹

1. $G = \sum_{r \le d} G^r(p)$.

2. For $1 \le r < d$ and $s > 0$, $G^r(p)$ disconnects $G^{r+s}(p)$ from all $G^i(p)$ for
which $i \le r - 1$.

3. For each r, $1 \le r < d$, $G^r(p)$ contains at least k points.


For any point $p_1 \ne p$, the distance of $p_1$ from p is $\le d$ and thus $p_1$ is in
some $G^i(p)$ for $i \le d$, which proves Part 1. Let $\{p_0 = p^1, \ldots, p^t = p'\}$
determine a path connecting $p_0$ in $G^i(p)$, for $i < r$, to $p'$ in $G^{r+s}(p)$. If $p^j$ is
distant $< r$ and $p^{j+1}$ distant $> r$ from p, then there exists a path
$\{p = q^1, \ldots, q^t = p^j\}$ such that $t \le r$ and $\{q^1, \ldots, q^t, p^{j+1}\}$ is a path of length $\le r$
connecting p and $p^{j+1}$, which is a contradiction. Therefore by induction it
follows that no path with p as initial point can contain a point in $G^i(p)$ for
$i > r$, if it contains no point in $G^r(p)$, which proves Part 2. Since G is k-connected
and $G^r(p)$ disconnects G for $1 \le r < d$, $G^r(p)$ must contain at
least k points.
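
The sets of points at each distance from p can be computed by a breadth-first search. The sketch below is illustrative only: it assumes the graph is a dictionary of neighbour sets, that "k-connected" means no set of fewer than k points disconnects G, and it uses a hypothetical 6-cycle (which is 2-connected in that sense) as the example.

```python
from collections import deque


def distance_layers(adj, p):
    """Return [G^0(p), G^1(p), ..., G^d(p)]: the points at each distance from p."""
    dist = {p: 0}
    queue = deque([p])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    d = max(dist.values())  # the associated number of p
    layers = [set() for _ in range(d + 1)]
    for v, r in dist.items():
        layers[r].add(v)
    return layers


if __name__ == "__main__":
    # Hypothetical example: a cycle on six points.
    cycle = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
    layers = distance_layers(cycle, 0)
    print(layers)
    # Part 3 of the lemma: every layer with 1 <= r < d has at least k = 2 points.
    print(all(len(layers[r]) >= 2 for r in range(1, len(layers) - 1)))
```

Part 1 of the lemma corresponds to the layers covering every point, and Part 3 to the size check on the interior layers.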


THEOREM 2.7. If G is k-connected, S is an r-dense subgraph for $r \le mk + 1$,
m being a non-negative integer, and p is a point in $G - S$, then the distance
from p to S is $\le m + 1$.


By 2.6, if the distance from p to S is $\ge m + 2$, there exist disjoint subgraphs
$G^i(p)$, i = 0, ..., m, in which no point has a neighbour in S and m
of which have at least k points each, contrary to the assumption that S is
r-dense for $r \le mk + 1$. When S consists of a single point and m = 1, 2.7
implies


¹The summation sign will be used to denote the union of all the subgraphs designated as
summands.

















THEOREM 2.8. If G contains a point which is $D_1$-minimal, the set of central
points is precisely the set of $D_1$-minimal points, and if G contains no point which
is $D_1$-minimal, but G is k-connected and contains a point p which is $D_r$-minimal
for $r \le k + 1$, then p is a central point of G.


THEOREM 2.9. If G has an isthmus or articulator $G'$, and also a point $p_1$
with associated number d, and a point $p_2$ distant d from $G'$, then there is a graph
$G''$ in the partition of $G - G'$ such that all the central points of G are contained
in the union of $G'$ and $G''$.


Suppose $p_2$ and a point $p_3$ are in two disconnected graphs of the partition
of $G - G'$. Then the distance from $p_2$ to $p_3$ is at least d + 1, and $p_3$ is not a
central point.


3. Union of the $D_k$-minimal subgraphs. The symbol "$\Gamma_k$" will
denote the union of all the $D_k$-minimal subgraphs of G.


LEMMA 3.1. For given k, an [m, k]-isthmus or [m, k]-articulator of $\Gamma_k$
disconnects G.


Let $\Gamma'$ be an [m, k]-isthmus or [m, k]-articulator of $\Gamma_k$ and suppose $G - \Gamma'$
is connected. Every point in $\Gamma'$ has a neighbour in $G - \Gamma'$, for $\Gamma'$ can have
no proper subgraph disconnecting $\Gamma_k$, and thus $G - \Gamma'$ is 1-dense and contains
a $D_k$-minimal subgraph $G'$. But $G - \Gamma' - (G - \Gamma_k) = \Gamma_k - \Gamma'$ is not
connected, and its partition consists of at least k + 1 graphs, so that it contains
no $D_k$-minimal subgraph. Thus $G'$ must contain some point of $G - \Gamma_k$, which
is a contradiction.

If G is m-connected, no [m, k]-articulator or [m, k]-isthmus of $\Gamma_k$ can have
a proper subgraph which disconnects G. Thus we have proved:


THEOREM 3.2. If G is m-connected, an [m, k]-articulator or [m, k]-isthmus
of $\Gamma_k$ is respectively an [m, 1]-articulator or [m, 1]-isthmus of G.


THEOREM 3.3. If S is any subgraph determined by k points and containing
no $D_k$-minimal subgraph, then $\Gamma_k(S)$ disconnects G.


If $G - \Gamma_k(S)$ is connected, it is k-dense and contains a $D_k$-minimal subgraph
in which some point not in S must have a neighbour in S, which is impossible.


THEOREM 3.4. If G is m-connected and k < m, then $\Gamma_k$ contains at least
m - k + 1 points.


If there exists S satisfying the hypothesis of 3.3, then $\Gamma_k(S)$ contains at
least m points as does $\Gamma_k$. If every subgraph of k points contains a $D_k$-minimal
subgraph of G, $G - \Gamma_k$ contains fewer than k points, and so $\Gamma_k$ contains more
than m - k points, that is, at least m - k + 1 points.

We now consider general conditions under which articulators and isthmuses
of G are contained in $\Gamma_k$.











THEOREM 3.5. If $G'$ is an articulator or isthmus such that the partition of
$G - G'$ consists of precisely q graphs, every q - 1 of which contain at least k
points among them, then $G'$ is contained in $\Gamma_k$.


If p is a point in $G'$, then $G - G' + p$ is 1-dense and contains a $D_k$-minimal
subgraph which cannot be contained in $G - G'$. An immediate consequence is


THEOREM 3.6. $Q_k$ is contained in $\Gamma_k$.


In proving the following theorem, we use the results, proved in (1), which 
assert that if G is not completely connected, it contains a disconnecting 
subgraph, and a proper subgraph which disconnects G contains an articulator 
or an isthmus. 


THEOREM 3.7. If G is not completely connected and for given k, every articulator
or isthmus is respectively, for some m, an [m, k]-articulator or [m, k]-isthmus,
then $\Gamma_k$ is contained in $Q_k$.


Suppose p is a point in $\Gamma_k$. If p is $D_k$-minimal, p is contained in every [m, k]-isthmus
and [m, k]-articulator by 2.4, and since G is not completely connected,
$Q_k$ is not null. If p is not $D_k$-minimal, let $\Gamma'$ be a $D_k$-minimal subgraph
containing p. Then either $\Gamma' - p$ is not connected or there is a point $p_1$ in $G - \Gamma'$
such that $p_1$ is a neighbour of p and of no other point in $\Gamma'$. In the first of
these two cases, $G - \Gamma' + p$ disconnects G, and therefore contains an [m, k]-isthmus
or [m, k]-articulator which must contain p since there are at most
k - 1 points in $G - \Gamma'$ which have no neighbours in $\Gamma'$. In the second case,
$G - \Gamma' - p_1 + p$ disconnects G and contains an [m, k]-articulator or
[m, k]-isthmus which must contain p by the same reasoning. For the case
k = 1, 3.6 and 3.7 jointly imply:


THEOREM 3.8. If G is not completely connected, then $Q_1 = \Gamma_1$.


We now proceed to obtain necessary conditions and sufficient conditions
that a point be contained in $G - \Gamma_k$.


THEOREM 3.9. If p is a point which is not $D_k$-minimal and is not a [1, 1]-isthmus,
and whenever $G'$ is a minimal subgraph with the property that $G' + p$
disconnects two points of G(p), $G'$ also disconnects p from at least k points of
$G - G'$, then p is contained in $G - \Gamma_k$.


Suppose a point p satisfies the conditions of the theorem and there is a
$D_k$-minimal subgraph $\Gamma'$ containing p and a point which is a neighbour of p.
Then either $\Gamma' - p$ is not connected (Case I), or there is a point $p_1$ in G which
is a neighbour of p and of no other point in $\Gamma'$ (Case II). In Case I, $G - \Gamma' + p$
disconnects two points of the intersection of G(p) with $\Gamma'$, and since p is not a
[1, 1]-isthmus, $G - \Gamma'$ contains a minimal subgraph $G'$, such that $G' + p$
disconnects these two points. But if $G'$ disconnects p from k points of $G - G'$,
then $\Gamma'$ is not $D_k$-minimal. Thus $\Gamma' - p$ must be connected, and we consider
Case II. In this case, $G - \Gamma' - p_1 + p$ disconnects $p_1$ in G(p) from another
point in the intersection of G(p) with $\Gamma'$. Then $G - \Gamma'$ contains a minimal
subgraph $G'$ such that $G' + p$ disconnects these two points, but by the same
reasoning as in Case I, $G'$ cannot disconnect p from k points of $G - G'$ if $\Gamma'$
is $D_k$-minimal. Thus p is contained in $G - \Gamma_k$.


THEOREM 3.10. A point p is contained in $G - \Gamma_k$ only if whenever $G'$ is
a minimal subgraph with the property that $G' + p$ disconnects at least k + 1
points of G(p) from one another, $G'$ also disconnects p from at least k points
of $G - G'$.


If $G'$ is a minimal subgraph with the property that $G' + p$ disconnects k + 1
points of G(p), then every point of $G'$ is connected to at least one of these
k + 1 points by a path which, except for its initial point, contains only points
of $G - G'$. Otherwise $G'$ would contain a proper subgraph with this property.
Accordingly, $G - G'$ contains a k-dense subgraph if $G'$ does not disconnect p
from at least k points in $G - G'$. But $G - G' - p$ cannot contain a k-dense
subgraph, since it contains k + 1 points no pair of which is connected. Accordingly
p belongs to a $D_k$-minimal subgraph. For the case k = 1, 3.9 and 3.10
yield


THEOREM 3.11. A point p which is not a [1, 1]-isthmus and is not $D_1$-minimal
is contained in $G - \Gamma_1$ if, and only if, whenever $G'$ is a minimal subgraph with
the property that $G' + p$ disconnects two points of G(p), $G'$ also disconnects G.


In a previous paper (1), it was shown that every connected graph contains
at least two proper 1-dense subgraphs. This result will now be used to prove


THEOREM 3.12. If $G - \Gamma_k$ contains fewer than k points and the $D_k$-minimal
subgraphs are mutually disjoint, then no $D_k$-minimal subgraph can have more
than k points.


Suppose $S_1, \ldots, S_r$ are the $D_k$-minimal subgraphs containing more than one
point, and let $S_r$ contain q > k points. Then there is a point $p_r'$ in $S_r$ such
that $S_r' = S_r - p_r'$ is a proper connected subgraph containing $q - 1 \ge k$
points. Then each of the $S_i$ (i = 1, ..., r - 1), being k-dense, must contain a
point $p_i$ having a neighbour in $S_r'$. There exists a proper subgraph $S_i' = S_i - p_i'$
containing $p_i$ and all but one of the points, $p_i'$, of $S_i$, since $S_i$ contains two
proper connected subgraphs. Consider the subgraph


$S = \sum_{i=1}^{r} S_i'.$


S is connected since each of the $S_i'$ is connected and all the $S_i'$ for i < r contain
neighbours of points in the connected subgraph $S_r'$. Since every $D_k$-minimal
subgraph containing only one point must have a neighbour in $S_r'$ and since
each of the points $p_i'$ has a neighbour in $S_i'$, S contains neighbours of all the
points in $\Gamma_k$ which are not in S. Since $G - \Gamma_k$ contains fewer than k points, S
is a k-dense subgraph containing no $D_k$-minimal subgraph, which is impossible.











In Figure 3 is shown an example of a graph in which all the conditions of
3.12 are satisfied for k = 2. The points numbered "1, 4, 6" are each $D_2$-minimal,
as is the pair {2, 3}. The point numbered "5" is the only point in
$G - \Gamma_2$.





FIGURE 3 


4. Colour Classes. A colour class is a subgraph no two points of which 
are neighbours. We shall study the relationship of this concept to connectivity, 
a relationship also investigated by Dirac (3; 4). A k-colouring of G is a set of 
k colour classes such that each point of G lies in one, and only one, class of 
the set. The chromatic number of a subgraph G’, which we shall denote by 
c(G’), is the smallest integer k such that there exists a k-colouring of G’. The 
graph in Figure 3 has chromatic number 3, a 3-colouring being formed from 
the colour classes {1, 5}, {3, 4}, {2, 6}. 
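
The definitions of colour class, k-colouring and chromatic number can be made concrete with a small exhaustive search. This is a sketch under stated assumptions: the graph is a dictionary of neighbour sets, the search is brute force and only practical for small graphs, and because Figure 3 is not reproduced here the example is a hypothetical 5-cycle. The function names are not from the paper.

```python
from itertools import product


def is_colouring(adj, classes):
    """True if `classes` is a set of colour classes covering each point exactly once."""
    points = sorted(v for cls in classes for v in cls)
    if points != sorted(adj):
        return False
    # A colour class contains no two points which are neighbours.
    return all(not (adj[v] & set(cls)) for cls in classes for v in cls)


def chromatic_number(adj):
    """Smallest k for which a k-colouring exists, found by exhaustive search."""
    verts = sorted(adj)
    for k in range(1, len(verts) + 1):
        for assignment in product(range(k), repeat=len(verts)):
            colour = dict(zip(verts, assignment))
            if all(colour[v] != colour[w] for v in verts for w in adj[v]):
                return k
    return 0  # only reached for the null graph


if __name__ == "__main__":
    # Hypothetical example: a cycle on five points.
    c5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
    print(is_colouring(c5, [{0, 2}, {1, 3}, {4}]))
    print(chromatic_number(c5))
```

The odd cycle needs three colours, and the three classes shown form a 3-colouring in the sense defined above.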


THEOREM 4.1. If G is m-connected for m < n:

1. $c(G) \ge n/(n - m)$.

2. If for any positive integer k, G has diameter $\ge 2k + 1$, then
$c(G) \le n - 2(mk - m + 1)$.

3. If G has diameter d = k + 1 for $k \ge 3$, no point of G can have degree
$> n - m(k - 3) - 3$.

4. The diameter d of G satisfies the condition:


$d - 4 \le (n - m - 3)/m.$


By 2.3, there can be no set of more than n - m points in which no pair are
neighbours, which implies Part 1.

To prove Part 2, let p be a point in G with associated number $d \ge 2k + 1$.
By 2.6, there is a sequence of subgraphs $G^r(p)$, $0 \le r \le d$, such that $G^i(p)$,
$i \ge 0$, has no neighbour in $G^{i+2}(p)$, while from Part 3 of 2.6, each $G^i(p)$,
$1 \le i \le 2k$, contains at least m points. Thus we can form m colour classes,
each containing one point from $G^1(p), G^3(p), \ldots, G^{2k-1}(p)$, and m other
colour classes, each containing a point from $G^2(p), \ldots, G^{2k}(p)$. The point p
can be assigned the same colour as a point in $G^2(p)$, while a point in $G^{2k+1}(p)$
can be assigned the same colour as a point in $G^{2k-1}(p)$. If an additional colour
is given to each of the $n - 2(mk + 1)$ points of G not already coloured, we
obtain an $[n - 2(mk - m + 1)]$-colouring of G.

If $p'$ is a point in $G^r(p)$ for $2 \le r \le k - 1$, $p'$ has no neighbours in $G^s(p)$
for $s < r - 1$ and $s > r + 1$. Thus, including $p'$, there are at least $m(k - 3) + 3$
points in G which are not neighbours of $p'$, which accordingly is of degree at
most $n - m(k - 3) - 3$. If $p'$ is in $G^r(p)$ for r = 0, 1, then $p'$ has no neighbours
in $G^s(p)$ for $s \ge 3$, and thus $p'$ is of degree at most
$n - m(k - 2) - 2 \le n - m(k - 3) - 3$. A similar argument applies when $p'$ is in $G^k(p)$ or
$G^{k+1}(p)$.

Part 4, for the case d > 4, follows from Part 3 and the observation, from
2.3, that if G is m-connected, no point has degree < m. If $d \le 4$, the theorem
is obviously true unless $m \ge n - 2$, in which case $d \le 2$. Then
$d - 4 \le -2 \le -2/m \le (n - m - 3)/m$ since $m \le n - 1$.


THEOREM 4.2. If G contains a subgraph $G'$ with the property that $c(G') = m$,
the partition of $G - G'$ consists of $q \ge 1$ connected graphs $G_i$ (i = 1, ..., q),
and $r = \max \{c(G_i)\}$, then $c(G) \le m + r$.


The reader can readily show that there exists an r-colouring of $G - G'$
which, combined with an m-colouring of $G'$, gives an (m + r)-colouring of G.

THEOREM 4.3. For k an integer $\ge 1$, if G contains at least 2k degree classes,
$c(G) \le n - k + 1$.

Let k degree classes $D_i$ (i = 1, ..., k) be arranged in order of decreasing
degree, so that the degree of a point in $D_i$ exceeds the degree of a point in
$D_{i+m}$ by at least 2m. From each $D_i$ choose one point $p_i$ (i = 1, ..., k). We
can choose $p_{12}$ in $G(p_1) - G(p_2) - p_2$. Assuming we have chosen j - 1
distinct points $p_{12}, \ldots, p_{1j}$ in such a way that $p_{1r}$ is in


$G(p_1) - G(p_r) - \sum_{i \le r} p_i$ (r = 2, ..., j),


then since $p_1$ has at least 2j neighbours which are not neighbours of $p_{j+1}$
and we have already chosen at most 2j - 1 of these, we can choose $p_{1,j+1}$,
distinct from $p_{1r}$ for $r \le j$, in


$G(p_1) - G(p_{j+1}) - \sum_{i \le j+1} p_i.$


In this way we can form k - 1 colour classes $S_i = p_i + p_{1i}$ (i = 2, ..., k)
each having two points. We obtain an (n - k + 1)-colouring of G by colouring
each of the remaining n - 2(k - 1) points with a separate colour.
















REFERENCES 


1. R. E. Nettleton, K. Goldberg, and M. S. Green, Dense subgraphs and connectivity, Can. J.
Math., 11 (1959), 262-268.

2. F. Harary and R. Z. Norman, The dissimilarity characteristic of Husimi trees, Ann. Math., 
58 (1953), 134-141. 

3. G. A. Dirac, A theorem of R. L. Brooks and a conjecture of H. Hadwiger, Proc. Lond. Math.
Soc., 7 (1957), 161-195.

4. G. A. Dirac, The structure of k-chromatic graphs, Fund. Math., 40 (1953), 42-55. 


The Rice Institute 













ON ORIENTATIONS, CONNECTIVITY AND 
ODD-VERTEX-PAIRINGS IN FINITE GRAPHS 


C. St. J. A. NASH-WILLIAMS 


1. Introduction. The integer part of a non-negative real number $p$ will
be denoted by $[p]$. For any integer $n$, $n^*$ will denote the greatest even integer
less than or equal to $n$, that is, $n^* = n$ or $n - 1$ according as $n$ is even or odd
respectively.

The order of a set A, denoted by |A|, is the number of elements in A. The
set whose elements are $a_1, a_2, \ldots, a_n$ will be denoted by $\{a_1, a_2, \ldots, a_n\}$. The
empty set will be denoted by $\Lambda$. A set will be said to include each of its elements.
A set separates two elements if it includes one but not both of them.

An unoriented graph U consists of two disjoint sets V(U), E(U), the elements
of V(U) being called vertices of U and the elements of E(U) being called
edges of U, together with a relationship whereby with each edge is associated
an unordered pair of distinct vertices which the edge is said to join. The letter
U, without further introduction, will always denote an unoriented graph. An
oriented graph N consists of two disjoint sets V(N), E(N), the elements of
V(N) being called vertices of N and the elements of E(N) being called edges of
N, together with a relationship whereby with each edge $\lambda$ is associated an
ordered pair $(\lambda t, \lambda h)$ of distinct vertices called the tail and head of $\lambda$ respectively;
the statement that $\lambda$ joins two vertices $\xi$ and $\eta$ will mean that either $\xi = \lambda t$
and $\eta = \lambda h$ or $\xi = \lambda h$ and $\eta = \lambda t$. The letter N, without further introduction,
will always denote an oriented graph. An orientation of U is any one of the
$2^{|E(U)|}$ oriented graphs N such that V(N) = V(U), E(N) = E(U) and each
edge of N joins the same vertices in N as in U.

Let G be an unoriented or oriented graph. Then G is finite or infinite according
as the set $V(G) \cup E(G)$ is finite or infinite. Henceforward, except when the
contrary is explicitly indicated, all graphs mentioned in this paper will be
finite, and the word "graph" will mean "finite graph." An edge of G is incident
with each of the vertices which it joins. If S, T are subsets of V(G), $\bar{S}$ will
denote $V(G) - S$, $S \circ T$ will denote the set of those edges which join elements
of S to elements of T, and $S\delta$ will denote $S \circ \bar{S}$. The degree of S, denoted by
d(S), is $|S\delta|$. The degree $d(\xi)$ of a vertex $\xi$ of G is the number of edges incident
with $\xi$; thus $d(\xi) = d(\{\xi\})$. A path of G is a finite sequence


$\xi_0, \lambda_1, \xi_1, \lambda_2, \xi_2, \lambda_3, \ldots, \lambda_n, \xi_n$


in which the $\xi_i$ are vertices of G, the $\lambda_i$ are edges of G and $\lambda_i \in \{\xi_{i-1}\} \circ \{\xi_i\}$
(i = 1, 2, ..., n). A path with first term $\xi$ and last term $\eta$ is a $\xi\eta$-path. A





Received April 13, 1959. 











556 C. ST. J. A. NASH-WILLIAMS 


collection of paths are edge-disjoint if no edge appears in more than one of them.
The connectivity $c(\xi, \eta)$ of two distinct vertices $\xi, \eta$ is the minimum of the
degrees of the subsets of V(G) which separate them. It can be shown¹ that
$c(\xi, \eta)$ is also the maximum number of edge-disjoint $\xi\eta$-paths which can be
found in G. G is k-connected if $d(S) \ge k$ for every non-empty proper subset S
of V(G).

A path

$\xi_0, \lambda_1, \xi_1, \lambda_2, \xi_2, \lambda_3, \ldots, \lambda_n, \xi_n$


of N is forwards-directed if $\lambda_i t = \xi_{i-1}$ (and so necessarily $\lambda_i h = \xi_i$) for
i = 1, 2, ..., n. If $S \subset V(N)$, an edge $\lambda$ is an exit of S if $\lambda t \in S$, $\lambda h \in \bar{S}$, and
is an entry of S if $\lambda h \in S$, $\lambda t \in \bar{S}$. The number of entries (exits) of S will be
denoted by e(S) (x(S)). If $\xi, \eta$ are distinct vertices of N, $a(\xi, \eta)$ (the coefficient
of accessibility of $\eta$ from $\xi$) is defined to be the minimum of the values of x(S)
as S runs through those subsets of V(N) which include $\xi$ but not $\eta$. It can be
shown² that $a(\xi, \eta)$ is also the maximum number of edge-disjoint forwards-directed
$\xi\eta$-paths which can be found in N. N is k-accessible if $x(S) \ge k$ for
every non-empty proper subset S of V(N). N is admissible if
$a(\xi, \eta) \ge [\frac{1}{2} c(\xi, \eta)]$ for every ordered pair $\xi, \eta$ of distinct vertices of N.

Robbins (4) proved that every 2-connected unoriented graph has an
orientation in which every vertex is accessible from every other. Such an
orientation is clearly 1-accessible, since, if $S \subset V(N)$ and x(S) = 0, no
element of $\bar{S}$ is accessible from any element of S. This suggests the generalization
that, for every positive integer k, every 2k-connected unoriented graph
has a k-accessible orientation. (2k-connectedness is of course a necessary
condition for possessing a k-accessible orientation, since $d(S) = x(S) + x(\bar{S})$
for every subset S of the vertices of an oriented graph.) Since an unoriented
(oriented) graph is clearly k-connected (k-accessible) if and only if $c(\xi, \eta)$
($a(\xi, \eta)$) $\ge k$ for every pair (ordered pair) of distinct vertices $\xi, \eta$, our proposed
generalization of Robbins' theorem states that, if $c(\xi, \eta) \ge 2k$ for every pair
$\xi, \eta$ of distinct vertices of U, then U has an orientation in which $a(\xi, \eta) \ge k$
for every ordered pair $\xi, \eta$ of distinct vertices (k being a positive integer).
This in turn suggests the following sharper result, which it is the object of
this paper to prove:


THEOREM 1. Every unoriented graph has an admissible orientation. 
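
For very small graphs the statement of Theorem 1 can be confirmed by exhaustive search. The sketch below is not the method of the paper: it assumes a graph given as a vertex list and a list of edges (pairs of vertices), computes c and a directly from their definitions as minima over vertex sets, and tries all orientations. All names are hypothetical.

```python
from itertools import combinations, product


def sets_from(vertices, xi, eta):
    """Yield every vertex set that includes xi but not eta."""
    others = [v for v in vertices if v != xi and v != eta]
    for r in range(len(others) + 1):
        for extra in combinations(others, r):
            yield {xi, *extra}


def c(vertices, edges, xi, eta):
    """Connectivity c(xi, eta): least degree d(S) of a set S separating xi and eta."""
    return min(sum((u in S) != (v in S) for u, v in edges)
               for S in sets_from(vertices, xi, eta))


def a(vertices, arcs, xi, eta):
    """Accessibility a(xi, eta): least x(S) over sets S including xi but not eta."""
    return min(sum(t in S and h not in S for t, h in arcs)
               for S in sets_from(vertices, xi, eta))


def admissible_orientation(vertices, edges):
    """Return the first orientation, as (tail, head) pairs, with a >= [c/2] for all pairs."""
    for flips in product((False, True), repeat=len(edges)):
        arcs = [(v, u) if flip else (u, v) for (u, v), flip in zip(edges, flips)]
        if all(a(vertices, arcs, x, y) >= c(vertices, edges, x, y) // 2
               for x in vertices for y in vertices if x != y):
            return arcs
    return None


if __name__ == "__main__":
    k4_vertices = [0, 1, 2, 3]
    k4_edges = list(combinations(k4_vertices, 2))
    print(admissible_orientation(k4_vertices, k4_edges))
```

In the complete graph on four vertices every pair has connectivity 3, so an admissible orientation is one in which every vertex is accessible from every other.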


Robbins’ theorem was extended to infinite graphs by Egyed (2). An exten- 
sion of Theorem 1 to infinite graphs has been obtained, but the details, being 
somewhat heavy, are deferred to a possible future paper. 


¹Since this result is relevant to the present paper only as a slight additional motivation for
the definition of connectivity, we omit its proof. It can be proved on lines suggested by the
proof of Menger's Theorem on pp. 244-247 of (3).

²This result is mentioned only as additional motivation for the definition of $a(\xi, \eta)$, and its
proof is omitted.













ORIENTATIONS, CONNECTIVITY, ODD-VERTEX-PAIRINGS 557 


A vertex of U is even or odd according as its degree is even or odd respectively.
A partition of a set A is a set of disjoint subsets of A whose union is A.
A pair-set of A is a set of subsets of order 2 of A. If P is a pair-set of A and
$B \subset A$, the subset $P_B$ of P is defined to be the set of those pairs $\{\alpha, \beta\} \in P$
such that B separates $\alpha, \beta$. If $S \subset V(U)$ and P is a pair-set of V(U), the
P-reduced degree $d^P(S)$ of S is $d(S) - |P_S|$. The P-reduced connectivity $c^P(\xi, \eta)$
of two distinct vertices $\xi, \eta$ is the minimum of the P-reduced degrees of the
subsets of V(U) which separate them. An odd-vertex-pairing of U is a partition
of the set of odd vertices of U into subsets of order 2; such a partition exists
since, by (3, chapter II, Theorem 3), the number of odd vertices of U is even³.
We shall show in §2 that, if P is an odd-vertex-pairing of U and $\xi, \eta$ are distinct
vertices of U, then $c^P(\xi, \eta) \le c(\xi, \eta)^*$. P will be called optimal if
$c^P(\xi, \eta) = c(\xi, \eta)^*$ for every pair $\xi, \eta$ of distinct vertices of U. Our proof of Theorem 1
will depend on the following subsidiary result:


THEOREM 2. Every unoriented graph has an optimal odd-vertex-pairing. 
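
As with Theorem 1, the existence claim can be tested by exhaustive search on small graphs. The sketch below is only an illustration under stated assumptions: the graph is a vertex list plus a list of edges, every pairing of the odd vertices is tried in turn, and the P-reduced connectivity is computed directly from its definition. The names are hypothetical.

```python
from itertools import combinations


def pairings(items):
    """Yield every partition of `items` (of even size) into pairs."""
    items = list(items)
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        for tail in pairings(rest[:i] + rest[i + 1:]):
            yield [(first, partner)] + tail


def separating_sets(vertices, xi, eta):
    """Yield the sets including xi but not eta (complements give the same degrees)."""
    others = [v for v in vertices if v != xi and v != eta]
    for r in range(len(others) + 1):
        for extra in combinations(others, r):
            yield {xi, *extra}


def reduced_degree(edges, pairs, S):
    """d^P(S) = d(S) - |P_S|; with an empty pair-set this is just d(S)."""
    d = sum((u in S) != (v in S) for u, v in edges)
    return d - sum((u in S) != (v in S) for u, v in pairs)


def optimal_pairing(vertices, edges):
    """Return the first odd-vertex-pairing P with c^P = c* for all pairs, or None."""
    deg = {v: 0 for v in vertices}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    odd = [v for v in vertices if deg[v] % 2 == 1]
    for P in pairings(odd):
        ok = True
        for xi, eta in combinations(vertices, 2):
            sets = list(separating_sets(vertices, xi, eta))
            cval = min(reduced_degree(edges, [], S) for S in sets)
            c_p = min(reduced_degree(edges, P, S) for S in sets)
            if c_p != cval - cval % 2:  # cval - cval % 2 is c*, the even part of c
                ok = False
                break
        if ok:
            return P
    return None


if __name__ == "__main__":
    # Hypothetical example: a 4-cycle 0-1-2-3 with the chord joining 0 and 2.
    print(optimal_pairing([0, 1, 2, 3], [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]))
```

In the example the only odd vertices are 0 and 2, so the single possible pairing must itself be optimal if Theorem 2 holds.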


2. Proof of Theorem 2. 
LEMMA 1. If $\alpha, \beta, \gamma$ are distinct elements of a set A and $B \subset A$, then
$|\{\{\alpha, \beta\}\}_B| + |\{\{\beta, \gamma\}\}_B| \ge |\{\{\alpha, \gamma\}\}_B|.$
(In accordance with our definitions, the notation $\{\{\theta, \phi\}\}_B$ means $P_B$, where
P is the pair-set whose sole member is $\{\theta, \phi\}$.)
The proof of Lemma 1 is left to the reader.
Definition. Let A be a set, P be a pair-set of A and B, C be subsets of A.


Then P(B, C) will denote the number of pairs $\{\alpha, \beta\} \in P$ such that one of
$\alpha, \beta$ belongs to B and the other to C.


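LEMMA 2. Let S, T be subsets of V(U) and P be a pair-set of V(U). Then
(i) $d(S) + d(T) = d(S \cap \bar{T}) + d(\bar{S} \cap T) + 2|(S \cap T) \circ (\bar{S} \cap \bar{T})|$;
(ii) $d^P(S) + d(T) \ge \frac{1}{2}(d^P(S \cap T) + d^P(S \cap \bar{T}) + d^P(\bar{S} \cap T) + d^P(\bar{S} \cap \bar{T}))$;


(iii) if $P_T = \{\{\theta, \phi\}\}$, where $\theta \in T$ and $\phi \in \bar{T}$, then
$d^P(S) + d(T) \ge d^P(S \cap \bar{T}) + d^P(\bar{S} \cap T) - 1,$
and this inequality can only become an equality if $\theta \in S \cap T$, $\phi \in \bar{S} \cap \bar{T}$.
Proof. Write
$S \cap T = Z_1$, $S \cap \bar{T} = Z_2$, $\bar{S} \cap T = Z_3$, $\bar{S} \cap \bar{T} = Z_4$,
$d_{ij} (= d_{ji}) = |Z_i \circ Z_j|$, $p_{ij} (= p_{ji}) = P(Z_i, Z_j)$.
Then (i) and (ii) are easily proved by expressing all terms on each side of (i)
and (ii) in terms of the $d_{ij}$ and $p_{ij}$; for example,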


³This result also follows by putting S = V(U) in Lemma 3 of this paper.








$d(S) = d_{13} + d_{14} + d_{23} + d_{24}$
$d^P(S) = d_{13} + d_{14} + d_{23} + d_{24} - p_{13} - p_{14} - p_{23} - p_{24}.$


It can also be shown by this method that
$d^P(S) + d(T) - d^P(S \cap \bar{T}) - d^P(\bar{S} \cap T) \ge |P_T| - 2P(S \cap T, \bar{S} \cap \bar{T}),$
which clearly implies (iii).
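
The counting identities behind Lemma 2 lend themselves to a randomised numerical check. The sketch below rests on an assumed reading of the lemma, since the overbars are not legible in every copy: (i) is taken as d(S) + d(T) = d(S ∩ T̄) + d(S̄ ∩ T) + 2|(S ∩ T) ∘ (S̄ ∩ T̄)|, (ii) as stated with d^P(S) + d(T) on the left, and the auxiliary inequality with the same two sets as (i). All function names are hypothetical.

```python
import random


def cut(items, S):
    """Number of pairs in `items` with exactly one end in S: d(S) for edges, |P_S| for pairs."""
    return sum((u in S) != (v in S) for u, v in items)


def between(items, A, B):
    """Number of pairs in `items` with one end in A and the other in B (A, B disjoint)."""
    return sum((u in A and v in B) or (u in B and v in A) for u, v in items)


def lemma2_holds(vertices, edges, pairs, S, T):
    """Check (i), (ii) and the auxiliary inequality of Lemma 2 under the assumed reading."""
    V = set(vertices)
    S, T = set(S), set(T)
    Sc, Tc = V - S, V - T

    def d(X):
        return cut(edges, X)

    def dP(X):
        return cut(edges, X) - cut(pairs, X)

    part_i = d(S) + d(T) == d(S & Tc) + d(Sc & T) + 2 * between(edges, S & T, Sc & Tc)
    part_ii = 2 * (dP(S) + d(T)) >= dP(S & T) + dP(S & Tc) + dP(Sc & T) + dP(Sc & Tc)
    aux = (dP(S) + d(T) - dP(S & Tc) - dP(Sc & T)
           >= cut(pairs, T) - 2 * between(pairs, S & T, Sc & Tc))
    return part_i and part_ii and aux


def random_trial(rng, n=6, m=10, npairs=4):
    """Build a random multigraph, pair list and sets S, T, then test the lemma on them."""
    vertices = list(range(n))
    edges = [tuple(rng.sample(vertices, 2)) for _ in range(m)]
    pairs = [tuple(rng.sample(vertices, 2)) for _ in range(npairs)]
    S = {v for v in vertices if rng.random() < 0.5}
    T = {v for v in vertices if rng.random() < 0.5}
    return lemma2_holds(vertices, edges, pairs, S, T)


if __name__ == "__main__":
    rng = random.Random(0)
    print(all(random_trial(rng) for _ in range(1000)))
```

A passing run is consistent with the assumed reading but is of course no substitute for the term-by-term proof indicated in the text.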


Definitions. If $S \subset V(U)$, o(S) will denote the number of odd elements of
S. An odd-vertex-pairing P of U is S-optimal if $c^P(\xi, \eta) = c(\xi, \eta)^*$ for every
pair $\xi, \eta$ of distinct elements of S. We define c(S) to be 0 if $S = \Lambda$ or V(U),
and to be


$\max_{\xi \in S,\ \eta \in \bar{S}} c(\xi, \eta)$


otherwise.
If m, n are integers, the statement "$m \equiv n \pmod 2$" will be abbreviated
to "$m \equiv n$."


LEMMA 3. If $S \subset V(U)$, $d(S) \equiv o(S)$.


Proof. If $\Sigma$ denotes the sum of the degrees of the elements of S, an edge
contributes 2, 1 or 0 to $\Sigma$ according as it belongs to $S \circ S$, $S\delta$ or $\bar{S} \circ \bar{S}$
respectively. Therefore $\Sigma \equiv |S\delta| = d(S)$. But clearly $\Sigma \equiv o(S)$.
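
The parity statement of Lemma 3 can be confirmed on any small graph by checking every subset. This is a minimal sketch, assuming a vertex list and a list of edges; the function name is hypothetical.

```python
from itertools import combinations


def parity_lemma_holds(vertices, edges):
    """True if d(S) and o(S) have the same parity for every subset S of the vertices."""
    deg = {v: 0 for v in vertices}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    for r in range(len(vertices) + 1):
        for subset in combinations(vertices, r):
            S = set(subset)
            d_S = sum((u in S) != (v in S) for u, v in edges)  # degree of the set S
            o_S = sum(deg[v] % 2 for v in S)                   # number of odd vertices in S
            if (d_S - o_S) % 2 != 0:
                return False
    return True


if __name__ == "__main__":
    print(parity_lemma_holds([0, 1, 2, 3], [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]))
```

Taking S to be the whole vertex set gives d(S) = 0, which recovers the fact that the number of odd vertices is even.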


COROLLARY 3A. If P is an odd-vertex-pairing of U, $d^P(S)$ is even.
Proof. Clearly $|P_S| \equiv o(S)$; therefore, by Lemma 3, $|P_S| \equiv d(S)$.
COROLLARY 3B. If $\xi, \eta$ are distinct vertices of U, $c^P(\xi, \eta) \le c(\xi, \eta)^*$.


Proof. Clearly $c^P(\xi, \eta) \le c(\xi, \eta)$. But $c^P(\xi, \eta)$ is even, by Corollary 3A.
Therefore $c^P(\xi, \eta) \le c(\xi, \eta)^*$.


COROLLARY 3C. If $Y \subset V(U)$, P is Y-optimal if and only if $d^P(S) \ge c(\xi, \eta)^*$
for every triple $S, \xi, \eta$ such that $S \subset V(U)$, $\xi \in S \cap Y$ and $\eta \in \bar{S} \cap Y$.
Proof. The given condition is equivalent to the assertion that, for every
pair $\xi, \eta$ of distinct elements of Y, $c^P(\xi, \eta) \ge c(\xi, \eta)^*$; and this inequality is
equivalent to equality by Corollary 3B.


COROLLARY 3D. P is optimal if and only if $d^P(S) \ge c(S)^*$ for every subset
S of V(U).


Proof. Take Y = V(U) in Corollary 3C.


Notational Conventions. When, to avoid ambiguity, it is necessary to specify the graph relative to which a graph-theoretical symbol is defined, the letter denoting the graph will be attached to the symbol in some convenient way. For example, if $\xi$ is a common vertex of two graphs $G$ and $H$, $d_G(\xi)$ will denote the degree of $\xi$ in $G$. We shall, however, make the convention that, whenever two or more graphs are under consideration and one of them is denoted by the letter $U$, all graph-theoretical symbols relate to $U$ unless the contrary is specified; for example, $d(\xi)$, if otherwise ambiguous, means $d_U(\xi)$.


Definitions. A subgraph of $U$ is an unoriented graph $H$ such that $V(H) \subset V(U)$, $E(H) \subset E(U)$ and each edge of $H$ joins the same vertices in $H$ as in $U$. If $X$ is a subset of $V(U)$, $U_X$ will denote the unoriented graph defined by

(i) $V(U_X) = X \cup \{\bar{X}'\}$, $E(U_X) = X \circ V(U)$, where $\bar{X}'$ is a newly introduced vertex and is not an element of the set $V(U) \cup E(U)$;

(ii) each element of $X \circ X$ joins the same vertices in $U_X$ as in $U$;

(iii) if $\xi \in X$ and $\lambda \in \{\xi\} \circ \bar{X}$, then $\lambda$ joins $\xi$ and $\bar{X}'$ in $U_X$.

Thus $U_X$ is obtained from $U$ by contracting to a single vertex $\bar{X}'$ the subgraph of $U$ formed by the elements of $\bar{X}$ and those of $\bar{X} \circ \bar{X}$.
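The contraction defining $U_X$ can be sketched as follows. This is an illustration under stated assumptions, not the paper's notation: edges are stored in a dictionary from a hypothetical edge name to its pair of end vertices, and the string `"X_bar'"` stands in for the newly introduced vertex $\bar{X}'$.

```python
def contract_complement(edges, X, new_vertex="X_bar'"):
    """Build U_X from U.

    edges: dict mapping an edge name to its pair of end vertices (u, v).
    X: the vertices to keep; every vertex outside X is merged into new_vertex.
    Returns (vertex set of U_X, edge dict of U_X).
    """
    X = set(X)
    V_new = X | {new_vertex}
    E_new = {}
    for name, (u, v) in edges.items():
        if u in X and v in X:
            E_new[name] = (u, v)            # rule (ii): edge inside X is unchanged
        elif u in X:
            E_new[name] = (u, new_vertex)   # rule (iii): far end moves to new_vertex
        elif v in X:
            E_new[name] = (new_vertex, v)   # rule (iii), other orientation of the pair
        # edges with both ends outside X are not edges of U_X
    return V_new, E_new
```

For a 4-cycle on vertices 1, 2, 3, 4 with $X = \{1, 2\}$, the result keeps the edge joining 1 and 2, redirects the two edges leaving $X$ to the new vertex, and discards the edge joining 3 and 4, so the new vertex has degree $d(X) = 2$.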


LEMMA 4. If $Z \subset X \subset V(U)$ and $P$ is an optimal odd-vertex-pairing of $U_X$, then $d(Z) \ge |P_Z| + c(Z)^*$.

Proof. Let $U_X = H$. Since $d(Z) = d_H(Z) \ge |P_Z| + c_H(Z)^*$ by Corollary 3D, it suffices to prove that $c_H(Z) \ge c(Z)$. This will clearly follow if we show that

(i) if $\xi$, $\eta$ are distinct elements of $X$, then $c_H(\xi, \eta) \ge c(\xi, \eta)$,

(ii) if $\zeta \in X$, $\eta \in \bar{X}$, then $c_H(\zeta, \bar{X}') \ge c(\zeta, \eta)$.

Let $W$ denote an arbitrary subset of $V(H)$, and $T$ denote whichever of $W$, $V(H) - W$ does not include $\bar{X}'$. Then (i) follows from the fact that, if $W$ separates $\xi$ and $\eta$, $d_H(W) = d(T) \ge c(\xi, \eta)$, and (ii) from the fact that, if $W$ separates $\zeta$ and $\bar{X}'$, $d_H(W) = d(T) \ge c(\zeta, \eta)$.


LEMMA 5. If $S, T \subset V(U)$, then either $c(S \cap T)^* \ge c(S)^* = c(\bar{S})^*$ or $c(S \cap \bar{T})^* \ge c(S)^* = c(\bar{S})^*$.

Proof. If $S = \Lambda$ or $V(U)$, the result is trivial. If not, select $\xi \in S$, $\eta \in \bar{S}$ such that $c(\xi, \eta) = c(S)$. If $\xi \in T$, $S \cap T$ separates $\xi$, $\eta$ and so
$$c(S \cap T) \ge c(\xi, \eta) = c(S),$$
whence $c(S \cap T)^* \ge c(S)^*$. Similarly, if $\xi \in \bar{T}$, then $c(S \cap \bar{T})^* \ge c(S)^*$. It is obvious that $c(S)^* = c(\bar{S})^*$.


LEMMA 6. If $S, T \subset V(U)$, then either
$$c(S \cap T)^* + c(\bar{S} \cap \bar{T})^* \ge c(S)^* + c(T)^* \tag{1}$$
or
$$c(S \cap \bar{T})^* + c(\bar{S} \cap T)^* \ge c(S)^* + c(T)^*. \tag{2}$$

Proof. We may assume without loss of generality that $c(S)^* \le c(T)^*$. Then, if $c(S \cap T)^* < c(S)^*$, we have $c(S \cap T)^* < c(T)^*$ also; these two inequalities and Lemma 5 give $c(S \cap \bar{T})^* \ge c(S)^*$ and $c(\bar{S} \cap T)^* \ge c(T)^*$, whence (2) is true. Similarly, if $c(\bar{S} \cap T)^* < c(S)^*$, then $c(\bar{S} \cap T)^* < c(T)^*$; these inequalities and Lemma 5 give $c(\bar{S} \cap \bar{T})^* \ge c(S)^*$, $c(S \cap T)^* \ge c(T)^*$, whence (1) is true. If, finally, $c(S \cap T)^* \ge c(S)^*$ and $c(\bar{S} \cap T)^* \ge c(S)^*$, then, since either $c(\bar{S} \cap \bar{T})^* \ge c(T)^*$ or $c(S \cap \bar{T})^* \ge c(T)^*$ by Lemma 5, either (1) or (2) respectively is true.


Definitions. Let $\xi, \eta \in V(U)$. Then a subset $X$ of $V(U)$ is $\xi\eta$-critical if $X$ separates $\xi$, $\eta$ and $d(X) = c(\xi, \eta) \equiv 0 \pmod 2$. A subset of $V(U)$ is critical if it is $\xi\eta$-critical for some pair $\xi$, $\eta$ of vertices of $U$.


LEMMA 7. If $X$ is a critical subset of $V(U)$, then $d(X) = c(X)^*$.

Proof. Since $X$ is critical, it is $\xi_0\eta_0$-critical for some $\xi_0 \in X$, $\eta_0 \in \bar{X}$. Therefore $c(\xi_0, \eta_0) = d(X)$. But $c(\xi, \eta) \le d(X)$ for every $\xi \in X$, $\eta \in \bar{X}$ by the definition of $c(\xi, \eta)$. Therefore $c(X) = d(X)$. Moreover, $d(X) \equiv 0 \pmod 2$ since $X$ is $\xi_0\eta_0$-critical. Since $c(X) = d(X) \equiv 0 \pmod 2$, it follows that $d(X) = c(X)^*$.


Definitions. The order of $U$, denoted by ord $U$, is $|V(U) \cup E(U)|$. If $\lambda \in E(U)$, $U - \lambda$ will denote the subgraph of $U$ defined by $V(U - \lambda) = V(U)$, $E(U - \lambda) = E(U) - \{\lambda\}$. If $\xi \in V(U)$, $\epsilon(\xi)$ is defined to be 0 or 1 according as $\xi$ is even or odd respectively. A subset $S$ of $V(U)$ is vertical (in the sense of "pertaining to a vertex") if either $|S| = 1$ or $|\bar{S}| = 1$. For every subset $S$ of $V(U)$, $S\delta$ is called a cincture⁴ of $U$. Two sets meet if they have at least one element in common. A subset $S$ of $V(U)$ divides a subset $Y$ of $V(U)$ if both $S$ and $\bar{S}$ meet $Y$. A subset of $V(U)$ is $Y$-minimal if (i) it divides $Y$ and (ii) its degree is minimal amongst the degrees of those subsets of $V(U)$ which divide $Y$. A subset of $V(U)$ is $Y$-critical if it is $\xi\eta$-critical for some pair $\xi$, $\eta$ of elements of $Y$. A $Y$-critical cincture is a cincture of the form $X\delta$, where $X$ is a $Y$-critical subset of $V(U)$.

We shall now suppose that U is a given unoriented graph, and make the 
inductive hypothesis that every unoriented graph of lower order than U has 
an optimal odd-vertex-pairing. By deducing that U has one also, we shall 
clearly establish Theorem 2. 


LEMMA 8. If $V(U)$ has a non-vertical critical subset, $U$ has an optimal odd-vertex-pairing.

Proof. Let $X$ be a non-vertical critical subset of $V(U)$, and let $H = U_X$, $K = U_{\bar{X}}$. The definition of "critical" implies that $X$ and $\bar{X}$ are non-empty; hence, since $X$ is non-vertical, $|X| \ge 2$ and $|\bar{X}| \ge 2$. Therefore ord $H <$ ord $U$ and ord $K <$ ord $U$. Therefore, by the inductive hypothesis, there exist optimal odd-vertex-pairings $P$, $Q$ of $H$, $K$ respectively. Since $X$ is critical, $d(X)$ is even. Therefore $\bar{X}'$, $X'$ are even in $H$, $K$ respectively. Moreover, each element of $X$, $\bar{X}$ has clearly the same degree in $U$ as in $H$, $K$ respectively. Therefore $P \cup Q$ ($= R$, say) is an odd-vertex-pairing of $U$. We will show that $R$ is optimal in $U$.


⁴This term is taken from (1).










If $S \subset V(U)$, then, by Lemmas 2 (i) and 4,
$$d(S) + d(X) \ge d(S \cap X) + d(\bar{S} \cap \bar{X}) \ge |P_{S \cap X}| + c(S \cap X)^* + |Q_{\bar{S} \cap \bar{X}}| + c(\bar{S} \cap \bar{X})^* = |R_S| + c(S \cap X)^* + c(\bar{S} \cap \bar{X})^*,$$
whence
$$d^R(S) + d(X) \ge c(S \cap X)^* + c(\bar{S} \cap \bar{X})^*. \tag{3}$$
If $T \subset V(U)$, application of (3) for $S = T$ and $S = \bar{T}$ gives
$$d^R(T) + d(X) \ge \max\bigl(c(T \cap X)^* + c(\bar{T} \cap \bar{X})^*,\ c(\bar{T} \cap X)^* + c(T \cap \bar{X})^*\bigr) \ge c(T)^* + c(X)^*$$
by Lemma 6. But $d(X) = c(X)^*$ by Lemma 7. Therefore $d^R(T) \ge c(T)^*$. Hence, by Corollary 3D, $R$ is optimal; and Lemma 8 is proved.


LEMMA 9. If $\lambda \in E(U)$, $Y \subset V(U)$ and $\lambda$ belongs to no $Y$-critical cincture, then
$$c_{U-\lambda}(\xi, \eta)^* = c(\xi, \eta)^* \tag{4}$$
for each pair $\xi$, $\eta$ of distinct elements of $Y$.

Proof. Let $\xi$, $\eta$ be distinct elements of $Y$. It is clear that
$$c_{U-\lambda}(\xi, \eta) = c(\xi, \eta)$$
(which implies (4)) unless there is a subset $X$ of $V(U)$ such that $X$ separates $\xi$ and $\eta$, $\lambda \in X\delta$ and
$$d(X) = c(\xi, \eta). \tag{5}$$
If there is such an $X$, then clearly $c_{U-\lambda}(\xi, \eta) = c(\xi, \eta) - 1$, so that (4) still holds if $c(\xi, \eta)$ is odd. But $c(\xi, \eta)$ cannot now be even, since this, together with (5), would imply that $X$ was $\xi\eta$-critical and hence $Y$-critical, so that $X\delta$ would be a $Y$-critical cincture including $\lambda$.


LEMMA 10. If $Y \subset V(U)$ and some edge $\lambda$ of $U$ belongs to no $Y$-critical cincture, then $U$ has a $Y$-optimal odd-vertex-pairing.

Proof. Let $\alpha$, $\beta$ be the vertices joined by $\lambda$. By the inductive hypothesis, we can select an optimal odd-vertex-pairing $P$ of $U - \lambda$. Since $\alpha$, $\beta$ each have different parities in $U - \lambda$ and $U$, and every other vertex has the same parity in each graph, an odd-vertex-pairing $R$ of $U$ may be defined as follows:

(i) if $\alpha$, $\beta$ are both odd in $U$, let $R = P \cup \{\{\alpha, \beta\}\}$;

(ii) if $\alpha$ is even and $\beta$ odd in $U$, let $R = (P - \{\{\alpha, \sigma\}\}) \cup \{\{\beta, \sigma\}\}$, where $\sigma$ is the vertex paired with $\alpha$ by $P$;

(iii) if $\alpha$ is odd and $\beta$ even in $U$, let $R = (P - \{\{\beta, \tau\}\}) \cup \{\{\alpha, \tau\}\}$, where $\tau$ is paired with $\beta$ by $P$;

(iv) if $\alpha$, $\beta$ are both even in $U$ and $\{\alpha, \beta\} \in P$, let $R = P - \{\{\alpha, \beta\}\}$;

(v) if $\alpha$, $\beta$ are both even in $U$ and $\{\alpha, \beta\} \notin P$, let
$$R = (P - \{\{\alpha, \sigma\}, \{\beta, \tau\}\}) \cup \{\{\sigma, \tau\}\},$$
where $\sigma$, $\tau$ are paired with $\alpha$, $\beta$ respectively by $P$.
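The five cases defining $R$ translate directly into a small routine. The sketch below is an illustration only: pairings are represented as sets of two-element `frozenset`s, `odd_in_U` is the set of odd vertices of $U$, the edge $\lambda$ is assumed not to be a loop, and the function name is hypothetical.

```python
def pairing_after_adding_edge(P, alpha, beta, odd_in_U):
    """Turn an odd-vertex-pairing P of U - lambda into one of U.

    P: set of frozensets {u, v} pairing the odd vertices of U - lambda.
    alpha, beta: the two (distinct) ends of the edge lambda.
    odd_in_U: set of vertices that are odd in U.
    """
    partner = {}
    for pair in P:
        a, b = tuple(pair)
        partner[a], partner[b] = b, a
    a_odd, b_odd = alpha in odd_in_U, beta in odd_in_U
    R = set(P)
    if a_odd and b_odd:                        # case (i)
        R.add(frozenset((alpha, beta)))
    elif not a_odd and b_odd:                  # case (ii)
        sigma = partner[alpha]
        R.remove(frozenset((alpha, sigma)))
        R.add(frozenset((beta, sigma)))
    elif a_odd and not b_odd:                  # case (iii)
        tau = partner[beta]
        R.remove(frozenset((beta, tau)))
        R.add(frozenset((alpha, tau)))
    elif frozenset((alpha, beta)) in P:        # case (iv)
        R.remove(frozenset((alpha, beta)))
    else:                                      # case (v)
        sigma, tau = partner[alpha], partner[beta]
        R -= {frozenset((alpha, sigma)), frozenset((beta, tau))}
        R.add(frozenset((sigma, tau)))
    return R
```

In every case the vertices covered by the returned pairing are exactly those of `odd_in_U`, which is the parity bookkeeping the proof relies on.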













LEMMA 10A⁵. If $S \subset V(U)$, $|R_S| \le |P_S| + |\{\{\alpha, \beta\}\}_S|$.

Proof. In Cases (i) and (iv), the result is clear. In Case (ii),
$$|R_S| = |P_S| - |\{\{\alpha, \sigma\}\}_S| + |\{\{\beta, \sigma\}\}_S| \le |P_S| + |\{\{\alpha, \beta\}\}_S|$$
by Lemma 1, and the discussion of Case (iii) is similar. In Case (v),
$$|R_S| = |P_S| - |\{\{\alpha, \sigma\}\}_S| - |\{\{\beta, \tau\}\}_S| + |\{\{\sigma, \tau\}\}_S|,$$
which yields the required result since, by two applications of Lemma 1,
$$|\{\{\sigma, \alpha\}\}_S| + |\{\{\alpha, \beta\}\}_S| + |\{\{\beta, \tau\}\}_S| \ge |\{\{\sigma, \tau\}\}_S|.$$

If $S \subset V(U)$, $\xi \in S \cap Y$ and $\eta \in \bar{S} \cap Y$, then, since $P$ is optimal in $U - \lambda$,
$$c_{U-\lambda}(\xi, \eta)^* = c^P_{U-\lambda}(\xi, \eta) \le d^P_{U-\lambda}(S) = d_{U-\lambda}(S) - |P_S|. \tag{6}$$
By Lemmas 9 and 10A and (6),
$$c(\xi, \eta)^* + |R_S| \le c_{U-\lambda}(\xi, \eta)^* + |P_S| + |\{\{\alpha, \beta\}\}_S| \le d_{U-\lambda}(S) + |\{\{\alpha, \beta\}\}_S| = d(S).$$
Hence, by Corollary 3C, $R$ is $Y$-optimal; and Lemma 10 is proved.

LEMMA 11. If $Y \subset V(U)$ and $\bar{Y} \circ \bar{Y} \ne \Lambda$, then $U$ has a $Y$-optimal odd-vertex-pairing.

Proof. Let $\lambda \in \bar{Y} \circ \bar{Y}$. If $\lambda$ belongs to no $Y$-critical cincture, Lemma 10 gives the required result; we may therefore assume that $\lambda \in X\delta$ for some $Y$-critical subset $X$ of $V(U)$. If $X$ were vertical, it would be of the form $\{\omega\}$ or $V(U) - \{\omega\}$ for some vertex $\omega$. But then $\lambda$ would be incident with $\omega$ since $\lambda \in X\delta$, and $\omega$ would belong to $Y$ since the $Y$-criticality of $X$ requires $X$ to separate two elements of $Y$; these conclusions contradict the assumption that $\lambda \in \bar{Y} \circ \bar{Y}$. Hence $X$ must be non-vertical. Therefore, by Lemma 8, $U$ has an optimal, and therefore $Y$-optimal, odd-vertex-pairing.


LEMMA 12. If $Y \subset V(U)$ and $X$ is a $Y$-minimal subset of $V(U)$, then

(i) $c(\xi, \eta) \ge d(X)$ for each pair $\xi$, $\eta$ of distinct elements of $Y$;

(ii) $c(S) \ge d(X)$ for every subset $S$ of $V(U)$ which divides $Y$.

Proof. Since $X$ is $Y$-minimal, $d(T) \ge d(X)$ for every subset $T$ of $V(U)$ which divides $Y$. This fact implies (i), and (i) implies (ii).

LEMMA 13. If $Y \subset V(U)$, $\bar{Y} \circ \bar{Y} = \Lambda$ and $V(U)$ has a non-vertical $Y$-minimal subset, then $U$ has a $Y$-optimal odd-vertex-pairing.

Proof. Let $X$ be a non-vertical $Y$-minimal subset of $V(U)$. Then $X$ divides $Y$, so that we can select two vertices $\sigma \in X \cap Y$, $\tau \in \bar{X} \cap Y$. Then $c(\sigma, \tau) \ge d(X)$ by Lemma 12 (i), and $c(\sigma, \tau) \le d(X)$ since $X$ separates $\sigma$, $\tau$;


⁵We give the names Lemma nA, Lemma nB to lemmas which themselves form part of the proof of Lemma n.
















hence $c(\sigma, \tau) = d(X)$. It follows that, if $d(X)$ is even, then $X$ is critical, so that $U$ has an optimal and therefore $Y$-optimal odd-vertex-pairing by Lemma 8. We shall therefore assume that $d(X)$ is odd.

Write $U_X = H$, $U_{\bar{X}} = K$. Since $X$ divides $Y$, $X \ne \Lambda$ and $\bar{X} \ne \Lambda$; therefore, since $X$ is non-vertical, $|X| \ge 2$ and $|\bar{X}| \ge 2$. Therefore ord $H <$ ord $U$ and ord $K <$ ord $U$; hence $H$, $K$ have, by the inductive hypothesis, optimal odd-vertex-pairings $P$, $Q$ respectively. Since $d(X)$ is odd, $\bar{X}'$, $X'$ are odd vertices of $H$, $K$ respectively; let $\bar{X}'$, $X'$ be paired with $\theta$, $\phi$ by $P$, $Q$ respectively. Then, since each element of $X$, $\bar{X}$ has the same degree in $U$ as in $H$, $K$ respectively,
$$R = (P - \{\{\theta, \bar{X}'\}\}) \cup (Q - \{\{\phi, X'\}\}) \cup \{\{\theta, \phi\}\}$$
is an odd-vertex-pairing of $U$.

LEMMA 13A. If either $Z \subset X$ or $Z \subset \bar{X}$, then $d^R(Z) \ge c(Z)^*$.

Proof. This follows from Lemma 4 and the obvious fact that $|R_Z| = |P_Z|$ or $|Q_Z|$ if $Z \subset X$ or $\bar{X}$ respectively.

To prove that $R$ is $Y$-optimal (which will establish Lemma 13), it suffices, by Corollary 3C, to prove

LEMMA 13B. If $S \subset V(U)$, $\xi \in S \cap Y$ and $\eta \in \bar{S} \cap Y$, then
$$d^R(S) \ge c(\xi, \eta)^*. \tag{7}$$

Proof. We shall consider separately the cases (I) $S \cap X$, $S \cap \bar{X}$, $\bar{S} \cap X$, $\bar{S} \cap \bar{X}$ all meet $Y$; (II) $S \cap X \subset \bar{Y}$; (III) $S \cap \bar{X} \subset \bar{Y}$; (IV) $\bar{S} \cap X \subset \bar{Y}$; (V) $\bar{S} \cap \bar{X} \subset \bar{Y}$. (It suffices that these cases are jointly exhaustive; that not all pairs of them are mutually exclusive does not matter.)

Proof of (7) in Case I. Let $Z_1$ be whichever of $S \cap X$, $S \cap \bar{X}$ includes $\xi$ and $Z_2$ be the other. Let $Z_3$ be whichever of $\bar{S} \cap X$, $\bar{S} \cap \bar{X}$ includes $\eta$ and $Z_4$ be the other. Then
$$c(Z_1) \ge c(\xi, \eta), \qquad c(Z_3) \ge c(\xi, \eta) \tag{8}$$
since $\xi \in Z_1 \subset \bar{Z}_3$ and $\eta \in Z_3 \subset \bar{Z}_1$. Moreover, since the $Z_i$ all meet $Y$ by the hypothesis of Case I, they all divide $Y$; therefore
$$c(Z_i) \ge d(X) \qquad (i = 1, 2, 3, 4) \tag{9}$$
by Lemma 12 (ii). Using Lemma 2 (ii), Lemma 13A and (8), and using (9) for $i = 2, 4$, we obtain
$$d^R(S) + d(X) \ge \tfrac{1}{2} \sum_{i=1}^{4} d^R(Z_i) \ge \tfrac{1}{2} \sum_{i=1}^{4} c(Z_i)^* \ge c(\xi, \eta)^* + d(X)^*,$$
which implies (7) since $c(\xi, \eta)^*$ and (by Corollary 3A) $d^R(S)$ are even.


Proof of (7) in Case II. To prove (7) by reductio ad absurdum, let us suppose that (7) is false. Since $c(\xi, \eta)^*$ and, by Corollary 3A, $d^R(S)$ are even, the falsity of (7) implies that













$$d^R(S) \le c(\xi, \eta)^* - 2. \tag{10}$$
Since $S \cap X \subset \bar{Y}$ by the hypothesis of Case II and $\xi \in S \cap Y$, it follows that $\xi \in S \cap \bar{X}$. But $\eta \in \bar{S} \cap Y \subset V(U) - (S \cap \bar{X})$. Therefore $c(S \cap \bar{X}) \ge c(\xi, \eta)$, and so, by Lemma 13A,
$$d^R(S \cap \bar{X}) \ge c(\xi, \eta)^*. \tag{11}$$
Since $X$ is $Y$-minimal, it divides $Y$; therefore, since $S \cap X \subset \bar{Y}$ by the hypothesis of Case II, $\bar{S} \cap X$ divides $Y$. Therefore $c(\bar{S} \cap X) \ge d(X)$ by Lemma 12 (ii), and so
$$d^R(\bar{S} \cap X) \ge d(X)^* \tag{12}$$
by Lemma 13A. By (10), Lemma 2 (iii), (11) and (12),
$$d(X) + c(\xi, \eta)^* - 2 \ge d^R(S) + d(X)$$
$$\ge d^R(S \cap \bar{X}) + d^R(\bar{S} \cap X) - 1 \tag{13}$$
$$\ge d(X)^* + c(\xi, \eta)^* - 1 \tag{14}$$
$$= d(X) + c(\xi, \eta)^* - 2$$
since we are assuming $d(X)$ to be odd. Hence each inequality in the above sequence must in fact be an equality. Equality in (13) implies that
$$\theta \in S \cap X \tag{15}$$
(and $\phi \in \bar{S} \cap \bar{X}$) by Lemma 2 (iii); and equality in (14) implies equality in (11) and (12), which, in the case of (12), gives
$$d^R(\bar{S} \cap X) = d(X)^*. \tag{16}$$
Since $\theta \in S \cap X$, $S \cap X \subset \bar{Y}$ and $\bar{Y} \circ \bar{Y} = \Lambda$ by (15), the hypothesis of Case II and a hypothesis of Lemma 13 respectively, it follows that
$$\{\theta\} \circ ((S \cap X) - \{\theta\}) = \Lambda. \tag{17}$$
Since $X$, being $Y$-minimal, divides $Y$ and $\theta \in \bar{Y}$ by (15) and the hypothesis of Case II, it follows that $X - \{\theta\}$ divides $Y$. Therefore $d(X - \{\theta\}) \ge d(X)$, since $X$ is $Y$-minimal. But, by (15) and Lemma 2 (i),
$$d((\bar{S} \cap X) \cup \{\theta\}) + d(X - \{\theta\}) - (d(\bar{S} \cap X) + d(X)) = 2|\{\theta\} \circ ((S \cap X) - \{\theta\})|,$$
which vanishes by (17). It follows from the last two sentences that
$$d((\bar{S} \cap X) \cup \{\theta\}) \le d(\bar{S} \cap X). \tag{18}$$
By (15) and the facts that $\{\theta, \phi\} \in R$ and $\phi \in \bar{X}$,
$$|R_{(\bar{S} \cap X) \cup \{\theta\}}| = |R_{\bar{S} \cap X}| + 1. \tag{19}$$
Since $X$, being $Y$-minimal, divides $Y$, $S \cap X \subset \bar{Y}$ by the hypothesis of Case II and $\theta \in X$, it follows that $(\bar{S} \cap X) \cup \{\theta\}$ divides $Y$. Therefore, by Lemma 12 (ii),




$$c((\bar{S} \cap X) \cup \{\theta\}) \ge d(X). \tag{20}$$
By subtracting (19) from (18), and using Lemma 13A and (20),
$$d^R(\bar{S} \cap X) - 1 \ge d^R((\bar{S} \cap X) \cup \{\theta\}) \ge c((\bar{S} \cap X) \cup \{\theta\})^* \ge d(X)^*,$$
which contradicts (16). This contradiction proves by reductio ad absurdum the truth of (7) in Case II.

The truth of (7) in Cases III to V can be proved by arguments similar to that given for Case II. (There is no real asymmetry between $X$ and $\bar{X}$ in our discussion, since a subset of $V(U)$ is non-vertical [$Y$-minimal] if and only if its complement is non-vertical [$Y$-minimal].) We have therefore now completed the proof of Lemma 13B and hence also that of Lemma 13.


LEMMA 14. If $Y \subset V(U)$ and $U$ has a $Z$-optimal odd-vertex-pairing for every proper subset $Z$ of $Y$, then $U$ has a $Y$-optimal odd-vertex-pairing.

Proof. If $|Y| = 0$ or 1, any odd-vertex-pairing of $U$ is vacuously $Y$-optimal; we may therefore assume that $|Y| \ge 2$. We may also, by Lemmas 11 and 13, assume that
$$\bar{Y} \circ \bar{Y} = \Lambda \tag{21}$$
and that $V(U)$ has no non-vertical $Y$-minimal subset. But $V(U)$ has a $Y$-minimal subset since $|Y| \ge 2$; hence it must have a vertical one. This clearly implies that $\{\omega\}$ is $Y$-minimal for some $\omega \in V(U)$. Since $\{\omega\}$ is $Y$-minimal, it divides $Y$; therefore $\omega \in Y$. Therefore, by the data of Lemma 14, $U$ has a $(Y - \{\omega\})$-optimal odd-vertex-pairing $P$.


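LEMMA 14A. If $S \subset V(U)$, $S \cap Y = \{\omega\}$ and $\xi \in \bar{S}$, then $d^P(S) \ge c(\omega, \xi)^*$.

Proof. Let $\tau \in S - \{\omega\}$, and let $A_\tau = \{\tau\} \circ (V(U) - \{\omega, \tau\})$, $B_\tau = \{\tau\} \circ \{\omega\}$. Since $S \cap Y = \{\omega\}$, $\tau \in \bar{Y}$. But $\{\omega\}$ divides $Y$ since it is $Y$-minimal. Therefore $\{\omega, \tau\}$ divides $Y$. Therefore, since $\{\omega\}$ is $Y$-minimal, $d(\{\omega, \tau\}) \ge d(\omega)$, which implies that $|A_\tau| \ge |B_\tau|$. Moreover, if this last inequality becomes an equality, $\tau$ must be even, since $d(\tau) = |A_\tau| + |B_\tau|$. Therefore
$$|A_\tau| \ge |B_\tau| + \epsilon(\tau). \tag{22}$$
Furthermore, since $S \cap Y = \{\omega\}$ and $\tau \in S - \{\omega\}$, it follows from (21) that
$$A_\tau = \{\tau\} \circ \bar{S}. \tag{23}$$
Since
$$d(S) = |\{\omega\} \circ \bar{S}| + \sum_{\tau \in S - \{\omega\}} |\{\tau\} \circ \bar{S}|, \qquad d(\omega) = |\{\omega\} \circ \bar{S}| + \sum_{\tau \in S - \{\omega\}} |B_\tau|,$$
it follows from (22) and (23) that
$$d(S) \ge d(\omega) + \sum_{\tau \in S - \{\omega\}} \epsilon(\tau) = d(\omega) + o(S - \{\omega\}) = d(\omega)^* + o(S). \tag{24}$$
But obviously $o(S) \ge |P_S|$; and, since $\{\omega\}$ separates $\omega$, $\xi$, we have $d(\omega) \ge c(\omega, \xi)$ and therefore $d(\omega)^* \ge c(\omega, \xi)^*$. Therefore, by (24), $d^P(S) \ge c(\omega, \xi)^*$.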













Let $\theta$ be any element of $Y - \{\omega\}$. If
$$S \subset V(U),\quad \omega \in S,\quad \theta \in \bar{S}, \tag{25}$$
then either $S \cap Y = \{\omega\}$, in which case $d^P(S) \ge c(\omega, \theta)^*$ by Lemma 14A, or $S$ includes an element $\psi$ of $Y - \{\omega\}$, in which case $d^P(S) \ge c^P(\psi, \theta) = c(\psi, \theta)^*$ since $S$ separates $\psi$, $\theta$ and $P$ is $(Y - \{\omega\})$-optimal, $c(\psi, \theta) \ge d(\omega)$ by Lemma 12 (i) and $d(\omega) \ge c(\omega, \theta)$ since $\{\omega\}$ separates $\omega$, $\theta$. Hence (25) implies that $d^P(S) \ge c(\omega, \theta)^*$. Therefore $c^P(\omega, \theta) \ge c(\omega, \theta)^*$, which must of course reduce to equality by Corollary 3B. Since this holds for any $\theta \in Y - \{\omega\}$, and since $P$ is already $(Y - \{\omega\})$-optimal, it follows that $P$ is $Y$-optimal. Lemma 14 is therefore proved.

Using induction on $|Y|$, we infer from Lemma 14 that $U$ has an optimal odd-vertex-pairing. This, in turn, completes the inductive step in our proof of Theorem 2 by induction on the order of the graph.


3. Proof of Theorem 1. $U$ is Eulerian if its vertices are all even. $N$ is quasi-symmetrical if $x(\{\xi\}) = e(\{\xi\})$ for every $\xi \in V(N)$. If $S, T \subset V(N)$, $S \to T$ will denote the number of edges $\lambda$ of $N$ such that $\lambda t \in S$, $\lambda h \in T$. If $H$ is a subgraph of $U$, an orientation $L$ of $U$ will be said to induce the orientation $M$ of $H$ such that $\lambda t_M = \lambda t_L$ and $\lambda h_M = \lambda h_L$ for every $\lambda \in E(H)$.

LEMMA 15. If $N$ is quasi-symmetrical and $S \subset V(N)$, then $x(S) = \tfrac{1}{2} d(S)$.

Proof. Since $N$ is quasi-symmetrical, we have
$$S \to V(N) = \sum_{\xi \in S} x(\{\xi\}) = \sum_{\xi \in S} e(\{\xi\}) = V(N) \to S.$$
Subtracting $S \to S$ from each side gives $S \to \bar{S} = \bar{S} \to S$, which clearly implies the required result.

Given any unoriented graph $U$, the following argument now shows that it has an admissible orientation. By Theorem 2, we can select an optimal odd-vertex-pairing $P$ of $U$. Construct an unoriented graph $H$ such that (i) $V(H) = V(U)$, (ii) $U$ is a subgraph of $H$, and (iii) two distinct vertices $\xi$, $\eta$ are joined in $H$ by exactly one element of $E(H) - E(U)$ if $\{\xi, \eta\} \in P$ and by no such element otherwise. Since $P$ is an odd-vertex-pairing of $U$, $H$ is Eulerian and therefore (3, p. 30, ll. 4-9) has a quasi-symmetrical orientation $Q$. Let $N$, $M$ be the induced orientations of $U$, $F$ respectively, where $F$ is the subgraph of $H$ defined by $V(F) = V(H)$ ($= V(U)$), $E(F) = E(H) - E(U)$. If $S \subset V(U)$, then $x_Q(S) = \tfrac{1}{2} d_H(S)$ by Lemma 15 and $x_M(S) \le d_F(S)$ obviously. Therefore
$$x_N(S) = x_Q(S) - x_M(S) \ge \tfrac{1}{2} d_H(S) - d_F(S) = \tfrac{1}{2}(d(S) - d_F(S)) = \tfrac{1}{2} d^P(S).$$
Hence, if $\xi$, $\eta$ are distinct vertices of $U$ and $S$ runs through all subsets of $V(U)$ which include $\xi$ but not $\eta$, we have
$$a_N(\xi, \eta) = \min x_N(S) \ge \min \tfrac{1}{2} d^P(S) = \tfrac{1}{2} c^P(\xi, \eta) = \tfrac{1}{2} c(\xi, \eta)^*,$$
since $P$ is optimal. Since $\tfrac{1}{2} c(\xi, \eta)^* = [\tfrac{1}{2} c(\xi, \eta)]$, this proves that $N$ is admissible.
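The construction in this argument can be sketched in code. The following is a minimal illustration, not the paper's procedure in detail: it assumes the pairing is already given (finding an optimal one is the content of Theorem 2 and is not attempted here), it represents graphs as lists of vertex pairs, and it obtains an orientation of $H$ in which every vertex has as many outgoing as incoming edges by walking closed trails, which is one standard way to orient an Eulerian graph. The function name is hypothetical.

```python
from collections import defaultdict

def orient_via_pairing(edges, pairing):
    """Orient U by orienting the Eulerian graph H = U + pairing edges.

    edges: list of (u, v) pairs, the edges of the unoriented graph U.
    pairing: list of (u, v) pairs in which every odd vertex of U occurs
             exactly once and no even vertex occurs (assumed, not checked).
    Returns a list of (tail, head), one per edge of U, in input order.
    """
    all_edges = list(edges) + list(pairing)       # edge list of H
    incident = defaultdict(list)
    for i, (u, v) in enumerate(all_edges):
        incident[u].append(i)
        incident[v].append(i)
    used = [False] * len(all_edges)
    oriented = [None] * len(all_edges)
    for start in list(incident):
        while True:
            v, moved = start, False
            # Walk until stuck; since all degrees of H are even, the walk
            # can only get stuck back at `start`, so it is a closed trail.
            while True:
                while incident[v] and used[incident[v][-1]]:
                    incident[v].pop()
                if not incident[v]:
                    break
                i = incident[v].pop()
                used[i] = True
                a, b = all_edges[i]
                w = b if v == a else a
                oriented[i] = (v, w)              # orient along the walk
                v, moved = w, True
            if not moved:
                break
    return oriented[:len(edges)]                  # drop the pairing edges
```

By the estimate displayed above, for any set $S$ the number of oriented edges leaving $S$ is then at least $\tfrac{1}{2}(d(S) - |P_S|)$, whichever pairing is supplied; optimality of the pairing is what turns this bound into admissibility.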










I am grateful to the referee for some improvements in the presentation of this paper.


REFERENCES 


1. R. Cantoni, Conseguenze dell’ ipotesi del circuito totale pari per le reti con vertici tripli, 
R. C. Ist. lombardo, Classe di Scienze Matematiche e Naturali (3), 14 (83) (1950), 
371-387. 

2. L. Egyed, Ueber die wohlgerichteten unendlichen Graphen, Math. phys. Lapok, 48 (1941), 
505-509 (Hungarian with German summary). 

3. D. König, Theorie der endlichen und unendlichen Graphen (Leipzig, 1936; reprinted New York, 1950).

4. H. E. Robbins, A theorem on graphs, with an application to a problem of traffic control, Amer. 
Math. Monthly, 46 (1939), 281-3. 


University of Aberdeen 











ON THE PROJECTIVE CENTRES OF CONVEX CURVES 
PAUL KELLY AND E. G. STRAUS


1. Introduction. We consider a closed curve $C$ in the projective plane and the projective involutions which map $C$ into itself. Any such mapping $\varphi$, other than the identity, is a harmonic homology whose axis $\eta$ we call a projective axis of $C$ and whose centre $p$ we call an interior or exterior projective centre according as it is inside or outside $C$.¹ The involutions are the generators of a group $\Gamma$, and the set of centres and the set of axes are invariant under $\Gamma$. The present paper is concerned with the type of centre sets which can exist and with the relationship between the nature of $C$ and its centre set.

If $C$ is a conic, then every point which is not on $C$ is a projective centre. Conversely, it was shown by Kojima (4) that if $C$ has a chord of interior centres, or a full line of exterior centres, then $C$ is a conic. Kojima's results, and in fact all the problems considered here, have interpretations in Hilbert geometries. If $C$ is convex and $x$ and $y$ are points interior to $C$, then the line $\eta = x \times y$ cuts $C$ in points $a$ and $b$, and the Hilbert distance defined by $h(x, y) = |\log R(a, b; x, y)|$ induces a metric on the interior of $C$. An involution $\varphi$ leaving $C$ invariant preserves this distance and hence is a motion of the Hilbert plane onto itself. If the centre $p$ of $\varphi$ is inside $C$, then the motion is a reflection in the point $p$. If $p$ is outside $C$, then the motion is a reflection in the Hilbert line carried by the axis $\eta$. When $C$ is a conic, then the Hilbert geometry is the Klein representation of hyperbolic geometry. Thus Kojima's results imply that a Hilbert geometry is hyperbolic if it possesses reflections in every point of a line, or reflections in every line of a pencil.
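As a concrete illustration of the Hilbert distance, the sketch below evaluates $h(x, y) = |\log R(a, b; x, y)|$ when $C$ is taken to be the unit circle, an assumption made purely for the example. The cross-ratio is computed from Euclidean distances along the chord; since only $|\log R|$ is used, the choice among the usual cross-ratio conventions does not affect the result. The function names are illustrative.

```python
import math

def chord_endpoints(x, y):
    """Points a, b where the line through x and y meets the unit circle.

    a lies on the side of x, b on the side of y (x and y distinct, interior).
    """
    dx, dy = y[0] - x[0], y[1] - x[1]
    # Solve |x + t (y - x)|^2 = 1 for the parameter t.
    A = dx * dx + dy * dy
    B = 2 * (x[0] * dx + x[1] * dy)
    K = x[0] ** 2 + x[1] ** 2 - 1
    root = math.sqrt(B * B - 4 * A * K)
    t1, t2 = (-B - root) / (2 * A), (-B + root) / (2 * A)
    a = (x[0] + t1 * dx, x[1] + t1 * dy)
    b = (x[0] + t2 * dx, x[1] + t2 * dy)
    return a, b

def hilbert_distance(x, y):
    """Hilbert distance between interior points x, y of the unit disc."""
    if x == y:
        return 0.0
    a, b = chord_endpoints(x, y)
    dist = math.dist
    cross_ratio = (dist(a, y) * dist(b, x)) / (dist(a, x) * dist(b, y))
    return abs(math.log(cross_ratio))
```

With this normalization the distance from the centre to the point $(r, 0)$ is $\log\frac{1+r}{1-r}$, and distances add along a chord, as one expects of the Klein model of the hyperbolic plane.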

In §2 we determine the convex closed curves which admit a continuous 
group of projective transformations and hence the plane Hilbert metrics with 
a continuous group of isometries. In §3 we apply these results in order to 
sharpen and extend Kojima’s characterizations. In §4 we consider curves with 
discontinuous transformation groups generated by projective involutions. 
Finally, in §5, we extend our results to higher dimensional projective spaces. 


2. Curves which are invariant under a continuous group of pro- 
jective transformations. In this section we wish to characterize the closed 
convex curves in the projective plane which permit infinitesimal projective 


Received July 7, 1959. 

¹These terms are justified by the fact that each interior projective centre is an affine centre if the corresponding axis is taken as the line at infinity, while the axis corresponding to an exterior centre becomes an affine axis (the locus of midpoints of parallel chords) if the exterior centre is at infinity. For an application of this concept see (2).




transformations onto themselves, and hence the Hilbert geometries which 
permit connected continuous groups of isometries. 

For this purpose we first determine the orbits of points under one-dimensional 
continuous subgroups of the projective group. Every such subgroup can be 
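represented by $G = \{\exp(tA) \mid -\infty < t < \infty\}$ where $A$ is a $3 \times 3$ matrix.

By suitable choice of co-ordinates we can reduce $A$ to one of the following forms (over the complex field)

(i) $$A = \begin{pmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{pmatrix}, \qquad \exp(tA) = \begin{pmatrix} e^{at} & 0 & 0 \\ 0 & e^{bt} & 0 \\ 0 & 0 & e^{ct} \end{pmatrix}$$

(ii) $$A = \begin{pmatrix} a + ib & 0 & 0 \\ 0 & a - ib & 0 \\ 0 & 0 & c \end{pmatrix}, \qquad \exp(tA) = \begin{pmatrix} e^{(a+ib)t} & 0 & 0 \\ 0 & e^{(a-ib)t} & 0 \\ 0 & 0 & e^{ct} \end{pmatrix}$$

(iii) $$A = \begin{pmatrix} a & 1 & 0 \\ 0 & a & 0 \\ 0 & 0 & c \end{pmatrix}, \qquad \exp(tA) = \begin{pmatrix} e^{at} & te^{at} & 0 \\ 0 & e^{at} & 0 \\ 0 & 0 & e^{ct} \end{pmatrix}$$

(iv) $$A = \begin{pmatrix} a & 1 & 0 \\ 0 & a & 1 \\ 0 & 0 & a \end{pmatrix}, \qquad \exp(tA) = \begin{pmatrix} e^{at} & te^{at} & \tfrac{1}{2}t^2 e^{at} \\ 0 & e^{at} & te^{at} \\ 0 & 0 & e^{at} \end{pmatrix}$$

where $a$, $b$, $c$ are real.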
Case (i). The orbit of $(x_0, y_0)$ is $x = x_0 e^{(a-c)t}$, $y = y_0 e^{(b-c)t}$ which is affine equivalent to one of the convex arcs
$$y = x^m, \qquad 0 \le x \le \infty,\quad -1 \le m \le 1$$
or a single point.

Case (ii). A suitable complex affine transformation yields the real orbit $x = e^{(a-c)t}[x_0 \cos bt - y_0 \sin bt]$, $y = e^{(a-c)t}[x_0 \sin bt + y_0 \cos bt]$. This is the affine equivalent of a circle if $a = c$ and of a spiral $r = e^{\theta}$ if $a \ne c$. The degenerate orbit is a single point.

Case (iii). The orbit is
$$x = e^{(a-c)t}(x_0 + t y_0), \qquad y = e^{(a-c)t} y_0.$$
This is affine equivalent to
$$x = y \log y, \qquad 0 < y < \infty;$$
with the full line, the half-line and the point as degenerate orbits.

Case (iv). The orbit is
$$x = x_0 + t y_0 + \tfrac{1}{2} t^2, \qquad y = y_0 + t$$
which is the affine equivalent of the parabola $x = y^2$.
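The orbit formulas can be checked numerically. The sketch below is an illustration under simple assumptions: it sums the power series for $\exp(tA)$ in plain Python, applies the result to the homogeneous point $(x_0, y_0, 1)$, and returns the affine co-ordinates. For the Jordan block of Case (iv) this reproduces $x = x_0 + t y_0 + \tfrac{1}{2}t^2$, $y = y_0 + t$, and the quantity $x - \tfrac{1}{2}y^2$ stays constant along the orbit, which is the parabola claim. The helper names are hypothetical.

```python
def mat_mul(P, Q):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def mat_exp(A, t, terms=40):
    """exp(tA) for a 3x3 matrix A, by truncating the power series."""
    tA = [[A[i][j] * t for j in range(3)] for i in range(3)]
    result = [[float(i == j) for j in range(3)] for i in range(3)]
    term = [row[:] for row in result]
    for n in range(1, terms):
        # term_n = term_{n-1} * (tA) / n, so term_n = (tA)^n / n!
        term = [[v / n for v in row] for row in mat_mul(term, tA)]
        result = [[result[i][j] + term[i][j] for j in range(3)]
                  for i in range(3)]
    return result

def orbit_point(A, t, x0, y0):
    """Affine image of (x0, y0) under the projective map exp(tA)."""
    E = mat_exp(A, t)
    X, Y, Z = (E[i][0] * x0 + E[i][1] * y0 + E[i][2] for i in range(3))
    return X / Z, Y / Z
```

For example, with `A = [[0.3, 1, 0], [0, 0.3, 1], [0, 0, 0.3]]` the common factor $e^{at}$ cancels on passing to affine co-ordinates, so the value of $a$ has no effect on the orbit.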













A closed curve which admits a one-dimensional group of projective trans- 
formations must consist entirely of orbits of its points under that group. We 
thus obtain the following. 


THEOREM 1. The convex closed curves in the projective plane which are invariant under a connected continuous group of projective transformations are projective equivalents of one of the following:

1. Two straight lines.

2. A curve consisting of the two arcs $y = x^m$, $x > 0$ and $y = -ax^m$, $x > 0$, $a > 0$ and their common endpoints, where $0 < m < 1$. (For $m = \tfrac{1}{2}$ this case includes the differentiable union of two conical arcs; and, in particular, the conic.)

3. A curve consisting of the arc $y = x^m$, $x > 0$ ($0 < m \le 1$), the line segment $x = 0$, $y \le 0$; and the segment on the line at infinity which corresponds to non-positive slopes. (For $m = 1$ this includes the triangle.)

4. A curve consisting of the arc $y = x \log x$, $x > 0$ and the segment $x = 0$, $0 \le y \le \infty$.


3. Curves whose projective centres have limit points which are not on the curve. If the projective centres $\{p_i\}$ of $C$ have a limit point $p \notin C$ then the corresponding projective involutions $\{\varphi_i\}$ of $C$ have a limit involution $\varphi$ whose centre is $p$. Hence $C$ admits the projective transformations $\{\varphi_i \varphi\}$ which approach the identity. In other words $C$ admits a connected continuous group of projective transformations.


THEOREM 2. If the projective centres of a closed convex curve $C$ have a limit point not on $C$ then one of the following cases holds:

1. $C$ is a conic and all points not on $C$ are projective centres of $C$. (This includes two lines as a degenerate case which admits all points of the plane as projective centres.)

2. $C$ is the union of arcs of two different conics with common endpoints which are points of differentiability of $C$. (This includes the case in which one of these arcs degenerates to two tangent line segments from a point to the other arc.) Here the projective centres consist of the points exterior to $C$ on the line of the common chord of the two conical arcs.

3. $C$ is the union of a conical arc and the chord joining its endpoints (this includes the triangle for a degenerate conical arc). Here the projective centres consist of the points on the line of the chord exterior to the chord. In the case of the triangle all points on the three lines which are not on the three sides are projective centres.


In order to prove this theorem we first establish the following extension of 
Kojima’s results. 


LEMMA 1. Let $a$ and $b$ be two points on the (convex) closed curve $C$ so that all points on the line $a \times b$ which are exterior to $C$ are projective centres of $C$. Then either

(i) $C$ is differentiable at $a$ and $b$ and the two arcs of $C$ with endpoints $a$, $b$ are conical arcs (one of which may be degenerate), or

(ii) the segment $(a, b)$ is on $C$ and the other arc of $C$ with endpoints $a$, $b$ is conical (possibly degenerate).


Proof. For any exterior point $x$ on $\sigma = a \times b$ the involution $\varphi_x$ interchanges $a$ and $b$, so both are regular points or both are corner points. If they are corner points then the two one-sided tangents at $a$ and the two at $b$ cannot be four distinct lines for then they would determine a quadrilateral, two of whose vertices would have to be invariant under every $\varphi_x$, which would imply that all $\varphi_x$ had a common axis. Thus, in this case, one of the one-sided tangents at $a$ coincides with one of the one-sided tangents at $b$. In other words,² the segment $(a, b)$ lies on $C$. Next, let $\omega$ be a one-sided tangent at an arbitrary point $p$ of $C$, $p \notin (a, b)$. If $x = \sigma \times \omega$ is exterior to $C$ then $\varphi_x$ leaves $p$ and $\omega$ fixed. Because $\varphi_x$ interchanges two arcs at $p$, it also interchanges the half tangents at $p$, so both must coincide with $\omega$. The only case in which $\sigma \times \omega$ is not exterior to $C$ is the case in which it is one of the points $a$, $b$ and therefore one arc of $C$ with endpoints $a$, $b$ consists of the line segments $(a, p)$, $(b, p)$.

Thus we have the following alternatives. Either $C$ is everywhere differentiable; or $C$ has a single corner point $p$ ($\ne a, b$), in which case it contains the line segments $(p, a)$ and $(p, b)$; or $C$ has two corner points, in which case these corner points are the points $a$, $b$ and $C$ contains the segment $(a, b)$;² or $C$ has three corner points, in which case it is a triangle.

In case $C$ is not a triangle, let $y$ be a point on a differentiable arc $C_{ab}$ of $C$ with endpoints $a$, $b$ and $C_{ab} \ne (a, b)$. There exists a unique conic $K_y$ which passes through $y$ and is tangent to $\omega_a$ at $a$ and $\omega_b$ at $b$, where $\omega_a$, $\omega_b$ are the one-sided tangents to $C_{ab}$ at its endpoints. Let $x = \sigma \times \omega_y$ where $\omega_y$ is the tangent to $C$ at $y$, then $\varphi_x$ leaves $y$, $C$ and $K_y$ invariant and hence $\omega_y$ is tangent to $K_y$. Now the family of conics tangent to $\omega_a$ and $\omega_b$ at $a$ and $b$ has no proper envelope. Hence $C_{ab}$ must be an arc of one of the conics of that family.


Proof of Theorem 2. Since our hypothesis implies that $C$ admits a continuous group of projective transformations we need only consider the cases enumerated in Theorem 1:

Case 1. Obvious.

Case 2. If $m \ne \tfrac{1}{2}$ then the origin and the point at infinity are distinguished by the fact that they either are not points of analyticity or that they are points of zero curvature. That is to say, any involution must permute the origin and the point at infinity between themselves. Thus all interior centres are on the


²If $C$ is not assumed convex, the points $a$, $b$ may be cusps formed by arcs with common one-sided tangents at $a$, $b$.










(positive) $x$-axis; and the exterior centres are either on the (negative) $x$-axis or the point at infinity of the $y$-axis.

Now the curve permits the affine transformations $x \to tx$, $y \to t^m y$, $t > 0$. Thus if $(x_0, 0)$ is a projective centre then so is $(tx_0, 0)$ for all $t > 0$. If $x_0 > 0$, then, by Kojima's theorem, $C$ would be a conic and if $x_0 < 0$ then, by Lemma 1, $C$ would consist of conical arcs.

The point at infinity on the $y$-axis is a projective centre if and only if the $x$-axis is a Euclidean axis of symmetry. In other words, if and only if $a = 1$. To sum up: If a curve in this case is not the union of two conical arcs, then it has at most one (exterior) projective centre. If $m = \tfrac{1}{2}$ and $a = 1$, then $C$ is a conic. If $m = \tfrac{1}{2}$ and $a \ne 1$, then $C$ is the union of two conical arcs and all points on the negative $x$-axis are projective centres of $C$.

Case 3. If $m = 1$ and $C$ is a triangle then the situation is obvious. If $m \ne 1$ then every projective mapping of $C$ onto itself must permute the origin and the point at infinity on the $x$-axis among themselves and must leave the point at infinity on the $y$-axis fixed. Thus all projective centres must be exterior (since an involution of $C$ which corresponds to an interior centre can have no fixed points) and lie on the negative $x$-axis. As in Case 2 we see that if there is one centre then all points on the negative $x$-axis are centres and by Lemma 1 we have $m = \tfrac{1}{2}$ so that $C$ is the union of a conical arc and a degenerate conical arc which is differentiable at the common endpoints.

Case 4. Any involution must preserve the straight line segment on this curve. Thus there can be no interior projective centre and an exterior projective centre would have to lie on the (negative) $y$-axis.

Now the curve admits the affine transformations $x \to tx$, $y \to ty + (t \log t)x$, $t > 0$; so that if $(0, y_0)$ is a projective centre then so is $(0, ty_0)$ for all $t > 0$. In other words, if there is a projective centre then all points on a supporting line which are not on the curve are projective centres. By Lemma 1 this would imply that the arc $y = x \log x$, $x > 0$ is a conical arc.

To sum up: A curve of Case 4 has no projective centres.

It is easy to see that the only non-convex closed curves whose projective 
centres have a limit point not on C are the unions of two conical arcs with 
common one-sided tangents at their juncture. 


4. Projective centres of general convex plane curves. 


THEOREM 3. If there are two centres $p_0$ and $p_1$ interior to $C$, then there is
an infinite sequence $\{p_n\}$, $n = 0, \pm 1, \pm 2, \ldots$, of interior centres on the
line $p_0 \times p_1$. The points of intersection of $C$ and $p_0 \times p_1$ are the two limit points
of the sequence,

$$p_{-\infty} = \lim_{n \to -\infty} p_n \quad \text{and} \quad p_{\infty} = \lim_{n \to \infty} p_n,$$

and $C$ is differentiable at these points. If $C$ is not a conic then its curvature has
a singularity of the second kind at $p_{-\infty}$ and $p_{\infty}$.


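Proof. Let the involutions corresponding to $p_0$ and $p_1$ be $\gamma_0$ and $\gamma_1$. Then
they generate the centre sequence $\{p_n\}$ and the corresponding sequence of
involutions $\{\gamma_n\}$, $n = 0, \pm 1, \pm 2, \ldots$, defined recursively by

$$\gamma_n = \gamma_{n-1}\gamma_{n-2}\gamma_{n-1}, \quad p_n = p_{n-2}\gamma_{n-1}, \qquad n = 2, 3, 4, \ldots;$$

$$\gamma_n = \gamma_{n+1}\gamma_{n+2}\gamma_{n+1}, \quad p_n = p_{n+2}\gamma_{n+1}, \qquad n = -1, -2, \ldots.$$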


Because the involutions $\gamma_i$ are motions of the Hilbert plane defined by $C$,
the Hilbert distance between any two successive centres in the $p_i$ or in the
$p_{-i}$ subsequence is the same. Thus, in the Hilbert sense, the $p_i$ and $p_{-i}$
sequences correspond to the points obtained by starting with $p_0$ and $p_1$ and
then repeatedly stepping off the distance $h(p_0, p_1)$ in the two directions along
the line respectively. Hence one sequence converges (in the topology of the
projective plane) to $a$ and the other to $b$, and these points are in some order
the points $p_{-\infty}$ and $p_{\infty}$.

From the collinearity of the centres $p_n$ it follows that the axes $\eta_n$ belong to a
pencil together with their limit lines $\eta_{-\infty}$ and $\eta_{\infty}$, which are lines of support to
$C$ at $p_{-\infty}$ and $p_{\infty}$ respectively. Each conic which is tangent to $\eta_{-\infty}$ and $\eta_{\infty}$ at
$p_{-\infty}$ and $p_{\infty}$ respectively is an invariant of all $\gamma_n$. Let $F$ be the family of these
conics. Then for each $q$ on $C$, $q \neq a, b$, the sequence of points $q_n = q\gamma_n$ lies
on the (unique) conic $K_q$ which is in $F$ and passes through $q$, and the sequence
has $p_{-\infty}$ and $p_{\infty}$ for its only limit points. The arc $A$ of $C$, with ends $q$ and
$q\gamma_0\gamma_1$ and which does not contain $p_{\infty}$, determines $C$ completely. Let $K_1$ be
the maximal conic of $F$ whose interior does not intersect $A$, and let $K_2$ be
the minimal conic of $F$ which contains $A$. Then $C$ lies entirely exterior to
$K_1$ and interior to $K_2$. Since $K_1$ and $K_2$ have common tangents at $p_{-\infty}$ and $p_{\infty}$,
these tangents must also be tangents to $C$.

Finally, if $C$ is not a conic then $K_1 \neq K_2$ and $C$ intersects every conic of $F$
between $K_1$ and $K_2$ infinitely often in every neighbourhood of $p_{-\infty}$ and of
$p_{\infty}$. Thus, in every such neighbourhood the curvature of $C$ oscillates between
that of $K_1$ and that of $K_2$.


In a completely analogous manner we can prove the following. 


THEOREM 4. If there are two exterior centres $p_0$, $p_1$ of $C$ so that $p_0 \times p_1$ is
a secant of $C$, then there is an infinite sequence $\{p_n\}$, $n = 0, \pm 1, \pm 2, \ldots$,
of exterior centres on $p_0 \times p_1$. The points of intersection of $C$ and $p_0 \times p_1$ are
the two limit points of the sequence,

$$p_{-\infty} = \lim_{n \to -\infty} p_n \quad \text{and} \quad p_{\infty} = \lim_{n \to \infty} p_n,$$

and $C$ is differentiable at these points. If the arc of $C$ on either side of $p_0 \times p_1$
is not conic (possibly degenerate) then its curvature has a singularity of the
2nd kind at $p_{-\infty}$ and $p_{\infty}$.


For the sake of completeness we prove the following theorem which is 
well known, though possibly not in this formulation (1, p. 190). 


THEOREM 5. If the set of centres of C is finite, then the number of centres is 
odd. There is at most one interior centre and the exterior centres are collinear on 
a line which does not intersect C. In other words, the finite subgroups, of the plane 
projective group, which are generated by involutions are isomorphic to the groups 
of symmetry of the regular polygons. 


Proof. If the number of centres is finite, then by Theorems 3 and 4 there 
cannot be two interior centres nor two exterior centres whose line intersects C 
or is a line of support to C. In any case, there cannot be exactly two centres, 
for then each would have to lie on the axis of the other. But that, in turn, is 
the condition for the product of their involutions to be an involution, or, what 
is the same thing, for the two axes to intersect in a third centre. 

Next suppose that $p_1$ and $p_2$ are the only exterior centres on the line
$p_1 \times p_2$. This line cannot intersect $C$, so the axes $\eta_1$ and $\eta_2$ intersect at an
interior centre $p_0$. Any third exterior centre $p_3$ is not on $p_1 \times p_2$ and hence
its axis $\eta_3$ does not pass through $p_0$. Then $\gamma_3$ carries $p_0$ to a second interior
centre, contrary to hypothesis.

We have thus proved that there is exactly one (interior or exterior) centre, 
or there are exactly one interior and two exterior centres, or every line through 
two exterior centres also passes through a third exterior centre. But the last 
condition applied to a finite set (here the exterior centres) is known to imply 
that the set is collinear. 

To see that the number of centres is odd, we consider first the case in which
an interior centre $p_0$ exists, and $p_1, p_2, \ldots, p_j$, $j \geq 3$, are the exterior centres
which obviously lie on $\eta_0$. Then $p_0$ lies on $\eta_i$, $i = 1, 2, \ldots, j$. Since $p_1$ is on
$\eta_0$ and $p_0$ is on $\eta_1$, the intersection $\eta_0 \times \eta_1$ is an exterior centre, say $p_2$. Under
$\gamma_1$, the points $p_1$ and $p_2$ are fixed and the remaining $j - 2$ exterior centres are
interchanged in pairs, hence $j - 2$ is even and the total number of centres
$j + 1$ is odd.

Next, suppose there is no interior centre. Then the point in which $\eta_1$
intersects the line of exterior centres cannot be itself a centre. Under $\gamma_1$ the point
$p_1$ is fixed and the remaining $j - 1$ centres are interchanged in pairs, hence
$j - 1$ is even and $j$ is odd.

The centre sets thus described are, up to projective transformations, the 
centre sets of the regular polygons. 

A subset of a centre set is independent if, in the corresponding subset of 
involutions, no one of the transformations can be generated by the remaining 
involutions. We now consider the extent to which independent reflections can 
exist in non-hyperbolic Hilbert geometries. 










THEOREM 6. There exist closed convex curves which are not conics and which 
have infinitely many independent projective centres. In fact, for every closed convex 
curve S, there exists a closed convex curve C with infinitely many independent 
projective centres and which coincides with S except for a portion of arbitrarily 
small linear measure. 


Proof. First, to illustrate our method, we give an example of such a curve.
On the x-axis, we pick a sequence of disjoint intervals $[u_i, v_i]$, $i = 1, 2, \ldots$,
so that $u_1 < v_1 < u_2 < v_2 < \cdots$. On the parabola $y = x^2$, let the points $U_i$
and $V_i$ have the co-ordinates $(u_i, u_i^2)$ and $(v_i, v_i^2)$ respectively. Let $\gamma_i$ be an
involution of the parabola which interchanges the inside and the outside of
the arc $(U_i, V_i)$, and let $\Gamma$ be the group which such involutions generate. The
centre of $\gamma_i$ may be any interior point of the chord $U_iV_i$ of the parabola, or
it may be the intersection point of the tangents to the parabola at $U_i$ and $V_i$.

We now construct a curve $C$ to consist of two parts $C'$ and $C''$. The first
part $C'$ is the union of arbitrary convex arcs $C_i$ with end points $V_{i-1}$, $U_i$,
$i = 1, 2, \ldots$, where

$$V_0 = \lim_{i \to \infty} V_i,$$

subject to the restriction that if the arcs $(V_{i-1}, U_i)$ of the parabola are
replaced by the arcs $C_i$, then the resulting curve is still closed and convex in
the projective plane (an especially simple example is that in which $C_i$ is
chosen as the straight line segment from $V_{i-1}$ to $U_i$). The remainder of $C$,
namely $C''$, is defined by the images of $C'$ under $\Gamma$.

To show that $C$ is a connected convex curve, we project the parabola onto
the x-axis from the point at infinity on the y-axis. To the projectivity $\gamma$ in
$\Gamma$ there then corresponds a one-dimensional projectivity $\gamma'$ of the x-axis onto
itself. The mappings $\gamma'$ form a group $\Gamma'$ which is generated by $\gamma_i'$, $i = 1, 2, \ldots$.
It is easily verified that $\gamma_i'$ is either an inversion in a one-dimensional circle,
or is such an inversion followed by a reflection on a point. In either case, $\gamma_i'$
interchanges the interior and exterior of the interval $[u_i, v_i]$.
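
The two kinds of one-dimensional maps just described can be checked numerically. The following is only a sketch under stated assumptions: the function names and the sample interval [1, 3] are illustrative and do not come from the paper, and the maps are written in the affine coordinate x with centre c = (u + v)/2 and radius r = (v - u)/2.

```python
from fractions import Fraction


def inversion(u, v):
    """Inversion in the one-dimensional circle with end points u < v.

    It fixes u and v and exchanges the inside and the outside of [u, v].
    """
    c = Fraction(u + v) / 2            # centre of the interval
    r2 = (Fraction(v - u) / 2) ** 2    # squared radius
    return lambda x: c + r2 / (x - c)


def inversion_then_reflection(u, v):
    """The same inversion followed by the reflection x -> 2c - x.

    It interchanges u and v and has no real fixed point.
    """
    c = Fraction(u + v) / 2
    r2 = (Fraction(v - u) / 2) ** 2
    return lambda x: c - r2 / (x - c)


# Hypothetical sample interval [u, v] = [1, 3].
f = inversion(1, 3)
g = inversion_then_reflection(1, 3)
outside_point = Fraction(5)
inside_images = (f(outside_point), g(outside_point))
```

Both maps are involutions of the projective line (the centre c is exchanged with the point at infinity), and a point outside [1, 3] is carried strictly inside it, which is the interchange property used in the argument.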

Let $R$ denote the points outside all the intervals $[u_i, v_i]$. If $x$ is interior
to $R$, then the image of $x$ under any mapping in $\Gamma'$, other than the identity,
lies inside one of the open intervals $(u_i, v_i)$, hence $\Gamma'$ is a properly
discontinuous group. The set $R$ is a fundamental region. From the theory of
discontinuous groups it follows that the images of $R$ under $\Gamma'$ fill the intervals
$[u_i, v_i]$ without gaps or overlap. Since $R$ is the projection of $C'$ and the $\Gamma$
images of $C'$ project to the $\Gamma'$ images of $R$, it follows that $C$ is a connected
curve. The convexity of the arcs of $C'$ is preserved by $\Gamma$. Since the parabola
remains convex when its arcs $(V_{i-1}, U_i)$ are replaced by those of $C'$, it will
still remain convex when the arcs $(V_{i-1}\gamma, U_i\gamma)$ are replaced by those of
$C'\gamma$, for any $\gamma$ in $\Gamma$. Repeating this argument, we find that the curve
$C(\gamma^{(1)}, \gamma^{(2)}, \ldots, \gamma^{(n)})$, which consists of $C' \cup C'\gamma^{(1)} \cup \cdots \cup C'\gamma^{(n)}$ on the
arcs $(V_{i-1}\gamma^{(k)}, U_i\gamma^{(k)})$, and of the parabola elsewhere, is convex. Thus it
follows that the curve

$$C = \lim_{n \to \infty} C(\gamma^{(1)}, \gamma^{(2)}, \ldots, \gamma^{(n)})$$

is convex. But if $\Gamma = (\gamma^{(1)}, \gamma^{(2)}, \ldots, \gamma^{(n)}, \ldots)$, then this curve is precisely
$C = C' \cup C''$.
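
The claim that every non-identity element of the projected group moves an interior point of the fundamental region into one of the open intervals can be tested on a finite sample. This is a minimal sketch, assuming three hypothetical disjoint intervals and using only the inversion form of each generator; the names `INTERVALS`, `SAMPLES` and the helper functions are illustrative, and words are applied from left to right.

```python
from fractions import Fraction
from itertools import product

# Hypothetical disjoint intervals [u_i, v_i] on the x-axis.
INTERVALS = [(-4, -3), (-1, 1), (2, 5)]
# Sample points strictly outside every interval (interior points of R).
SAMPLES = [Fraction(-10), Fraction(-2), Fraction(3, 2), Fraction(10)]


def gamma_prime(i, x):
    """Inversion in the one-dimensional circle with end points u_i, v_i."""
    u, v = INTERVALS[i]
    c = Fraction(u + v, 2)
    r2 = Fraction(v - u, 2) ** 2
    return c + r2 / (x - c)


def reduced_words(length):
    """Words in the generators with no generator repeated consecutively."""
    for word in product(range(len(INTERVALS)), repeat=length):
        if all(a != b for a, b in zip(word, word[1:])):
            yield word


def image(word, x):
    """Apply the generators of the word one after another."""
    for i in word:
        x = gamma_prime(i, x)
    return x


def lands_in_last_interval(word, x):
    """True if the image lies in the open interval of the last generator."""
    u, v = INTERVALS[word[-1]]
    return u < image(word, x) < v


all_words_ok = all(
    lands_in_last_interval(word, x)
    for length in range(1, 5)
    for word in reduced_words(length)
    for x in SAMPLES
)
```

Since a generator is an involution, any non-identity group element is represented by such a reduced word, so the check illustrates (for the sampled points only) why the images of the region do not overlap the region itself.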

We now proceed with the proof of the general case. Let $S$ be a closed convex
curve which is not a polygon. Then there exists an infinite sequence of disjoint
arcs $\{(U_i, V_i)\}$ of $S$ which are either not straight line segments or there are
lines of support to $S$ at $U_i$ and $V_i$ whose union does not contain the arc
$(U_i, V_i)$ and so that the total length of the arcs $(U_i, V_i)$ is arbitrarily small.
If $S$ is a polygon then we first replace it by a curve which is not a polygon
and coincides with $S$ except on arcs of arbitrarily small total length.

To each $(U_i, V_i)$ we associate a projective involution $\gamma_i$ such that $\gamma_i$ either
has the line $\sigma_i = U_i \times V_i$ as its axis and its centre is $\lambda_i \times \mu_i$, where $\lambda_i$ and
$\mu_i$ are lines of support to $S$ at $U_i$ and $V_i$ respectively, or else $\gamma_i$ has its centre
interior to $S$ on the interval $(U_i, V_i)$ and interchanges $\lambda_i$ and $\mu_i$. The triangle
$\Delta_i$ formed by $\sigma_i$, $\lambda_i$, $\mu_i$, which contains the arc $(U_i, V_i)$, has the property
that $\Delta_i \cap \Delta_j$ is empty, for $i \neq j$. The $\gamma_i$ are therefore independent, since $\gamma_i$
maps a point of $S$ not in the union of the arcs $(U_j, V_j)$ into $\Delta_i$. Finally, let
$\Gamma$ be the group generated by the mappings $\gamma_i$.

We now construct the curve $C$ by successive steps. Let $S_1$ be the curve
obtained from $S$ by replacing each arc $(U_i, V_i)$ by the image under $\gamma_i$ of the
complementary arc $(U_i, V_i)'$ of $S$. We can now repeat this construction
starting with $S_1$ and the arcs $\{(U_i, V_i)\gamma_j\}$, $i, j = 1, 2, \ldots$, with the
corresponding involutions $\gamma_j\gamma_i\gamma_j$, leading to a curve $S_2$, etc. All the curves $S_n$ are
convex, and they agree at more and more points so that the length of the
complement of $S_n \cap S_{n+1}$ in $S_{n+1}$ converges to zero. Thus

$$C = \lim_{n \to \infty} S_n$$

exists and is a convex closed curve. Since $S_n \cap S_{n+1} \supset (S_{n-1} \cap S_n)\gamma_i$, and the
set $\lim_{n \to \infty} S_n \cap S_{n+1}$ is dense in $C$, we have

$$C = \mathrm{Cl}\,\bigl(\lim_{n \to \infty} S_n \cap S_{n+1}\bigr) = \mathrm{Cl}\,\bigl[\lim_{n \to \infty} (S_{n-1} \cap S_n)\gamma_i\bigr] = C\gamma_i,$$

so $C$ has the desired involutions.


5. Projective involutions in higher dimensional projective spaces.
The projective involutions of (real) $n$-dimensional projective space $P^n$ need
not have "centres." Since the eigenvalues are all $\pm 1$ the dimensions of the
characteristic spaces are determined by the signature. If that signature is
$s = n + 1 - 2k$ then the characteristic spaces are $k - 1$ and $n - k$ dimensional,
leading to "projective $(k - 1)$-planes of symmetry." In case $s = \pm(n - 1)$
we again have a projective centre and a projective hyperplane of symmetry.
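
The dimension count can be tabulated directly. The sketch below assumes that the involution is represented by a diagonal matrix of order n + 1 with k entries equal to -1 and n + 1 - k entries equal to +1; the function names are illustrative and not part of the paper.

```python
def characteristic_space_dims(n, k):
    """Signature and projective dimensions of the two characteristic spaces.

    Assumes an involution of P^n given by a diagonal (n+1) x (n+1) matrix
    with k entries -1 and n + 1 - k entries +1, where 1 <= k <= n.
    """
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n for a non-trivial involution")
    signature = (n + 1 - k) - k          # s = n + 1 - 2k
    return signature, (k - 1, n - k)


def has_centre(n, k):
    """A centre exists when one characteristic space is a single point."""
    _, dims = characteristic_space_dims(n, k)
    return 0 in dims


plane_case = characteristic_space_dims(2, 1)     # a centre and an axis
space_case = characteristic_space_dims(3, 2)     # two skew lines, no centre
```

In the plane every non-trivial involution has a centre, while in three dimensions the case k = 2 gives two lines of fixed points and no centre, in agreement with the condition s = ±(n - 1).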

While it would not be difficult to obtain a characterization of the orbits of 
points under one-dimensional groups of projective transformations, there is 
no hope of obtaining a simple characterization of the convex closed surfaces 
which admit a continuous group of projective transformations. So, if we wish 
to obtain results analogous to those of § 2 we have to proceed somewhat 
differently. 

Even the concept of a convex surface in $P^n$ needs some elaboration which
did not arise in $P^2$. In $P^2$ a closed curve $C$ which separates the space into
two open regions which are segmentwise connected is the projective
equivalent of an affine convex curve (if we include two parallel lines in this
description). Thus we may use as our definition of projective convexity either the
property

(i) C is simple closed so that the components of its complement are seg- 
mentwise connected, or 

(ii) C is projectively equivalent to a convex closed curve. 

Clearly (ii) implies (i). 

In $P^3$ these two definitions no longer coincide and Kneser (3) has shown
that the only surface which satisfies (i) but not (ii) is a quadric surface. For 
higher dimensions the situation is not known. We shall use Kneser’s definition 
(i) when we speak of a closed convex surface. 


THEOREM 7. A closed convex surface $S$ in $P^n$ has a projective centre $p \in S$ if
and only if either 

(i) S is a convex cone with vertex p; or 

(ii) S consists of two hyperplanes and p is any point on S. 


Proof. We first prove the theorem for $n = 2$. If we take the axis of $p$ to
be the line at infinity then there is an arc of $S$ which contains $p$ and has $p$
as affine centre. Since a convex arc can be symmetric about one of its points
only if it is a straight line segment, we are left with only two possible cases.
Either $S$ consists of two straight lines through $p$, or of one straight line through
$p$ and the line at infinity.

For general $n$ we now have that every two-plane through $p$ intersects $S$
either in $p$ alone or in a single straight line or in two straight lines or lies
entirely in $S$. The 2-planes which contain points on both sides of $S$ therefore
always intersect it in two straight lines. Either both of these lines go through
$p$, or one goes through $p$ and the other lies in the hyperplane of symmetry, $\pi$,
corresponding to $p$. For reasons of continuity we see that only one of these
possibilities occurs. In the first case $S$ is a cone with vertex $p$. In the second
case $S$ contains the entire plane $\pi$ and, since the only convex closed surface
which contains a hyperplane consists of two hyperplanes, it follows that $S$
consists of $\pi$ and a hyperplane through $p$.













THEOREM 8. The projective centres of a convex closed surface $S$ which lie in
$P^n - S$ form the union of a discrete, relatively closed set of linear manifolds
(that is, all the points of a $k$-plane, $0 \leq k \leq n$, which lie on one side of $S$).

If the projective centres have a limit point $p$ in $P^n - S$ then all centres which
do not belong to the $k$-dimensional component $C_p^k$ of $p$ must lie either in that
part of the plane $P^k$ of $C_p^k$ which lies on the other side of $S$ than $p$, or in the
$(n - k - 1)$-dimensional plane of intersection of the hyperplanes of symmetry
which correspond to the centres of $C_p^k$. If there are any centres in $P^k - C_p^k - S$
then all points of $P^k - S$ are centres.

Proof. If the projective centres are discrete in $P^n - S$ then the theorem
is true. Assume now that there is a point $p \notin S$ which is the limit point of a
sequence of projective centres $\{p_i\}$.

We first prove relative closure by showing that $p$ itself is a projective centre.
To any point $q$ not on $S$ we can associate the locus $\Sigma_q$ as follows. Let $\lambda$ be any
line through $q$ which intersects $S$ in the points $a$, $b$ and let $q_\lambda$ be the harmonic
conjugate of $q$ with respect to $a$, $b$. Then $\Sigma_q$ is the locus of all $q_\lambda$.

Now $q$ is a projective centre if and only if $\Sigma_q$ is planar. Since $\Sigma_q$ is clearly
a continuous function of $q$ for all $q \notin S$ it follows that

$$\Sigma_p = \lim_{i \to \infty} \Sigma_{p_i}$$

is planar.
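
The locus of harmonic conjugates can be computed explicitly in the simplest case. The sketch below takes S to be the unit circle in the affine plane (an assumption made only for illustration, as are the function names) and checks that the harmonic conjugates of q = (1/2, 0) all lie on one line, namely the polar x = 2. Along the line q + t d, the point q has parameter 0, so its harmonic conjugate with respect to the parameters t1, t2 of the intersection points has parameter 2 t1 t2 / (t1 + t2).

```python
import math


def harmonic_conjugate_on_line(q, d):
    """Harmonic conjugate of q with respect to the two points in which the
    line q + t*d (d a unit vector) meets the unit circle.

    Returns None if the line misses the circle or the conjugate is at infinity.
    """
    qx, qy = q
    dx, dy = d
    b = qx * dx + qy * dy              # q . d
    c = qx * qx + qy * qy - 1.0        # |q|^2 - 1
    disc = b * b - c
    if disc <= 0.0:
        return None
    t1 = -b + math.sqrt(disc)
    t2 = -b - math.sqrt(disc)
    if abs(t1 + t2) < 1e-12:           # conjugate is the point at infinity
        return None
    t = 2.0 * t1 * t2 / (t1 + t2)      # cross-ratio (t1, t2; 0, t) = -1
    return (qx + t * dx, qy + t * dy)


def locus(q, n_dirs):
    """Sample the locus of harmonic conjugates over n_dirs directions."""
    points = []
    for k in range(1, n_dirs):
        angle = math.pi * k / n_dirs
        p = harmonic_conjugate_on_line(q, (math.cos(angle), math.sin(angle)))
        if p is not None:
            points.append(p)
    return points


sample_locus = locus((0.5, 0.0), 7)
```

For a conic every point off the curve has a planar (here linear) locus, which is consistent with every such point being a projective centre of a conic.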

Let $\gamma_i$ and $\gamma$ be the involutions with centres $p_i$ and $p$ respectively. The
centres of the central involutions in $\{\gamma_i, \gamma\}$ all lie on the line $p \times p_i$, and
are the images of $p$ and $p_i$ under the transformations $\{\gamma\gamma_i\}$. Since $\gamma\gamma_i \to 1$,
the identity map, as $p_i \to p$ we see that the projective centres cover the
line $p \times p_i$ "more and more densely" in a neighbourhood of $p$ as $p_i \to p$,
and that therefore $p$ is an interior point of a segment of centres on any limit
line of the lines $p \times p_i$. The endpoints of a segment of centres on such a
limit line are themselves limit centres and therefore interior to a segment
of centres on the line unless they lie on $S$. Thus a limit line $\lambda$ of the lines
$p \times p_i$ consists entirely of projective centres if $\lambda$ does not intersect $S$; if $\lambda$
meets $S$ in a single point then all other points of $\lambda$ are projective centres; if
$\lambda$ meets $S$ in two points then all points in the open component $\mu$ of $\lambda - S$
which contains $p$ are centres. The other component $\mu'$ (if any) of $\lambda - S$ may
contain no projective centre, but if it contains one centre $p'$ then $p'$ can be
invariant under at most one of the involutions defined by the centres in $\mu$.
Hence $p'$ is itself a limit point of centres and all points of $\lambda$ except $\lambda \cap S$
are projective centres.

Now let $p' \notin \lambda$ be a projective centre in $P^n - S$. The discussion in the
proof of Theorem 2 shows that the orbit $O_{p'}$ of $p'$ under the group generated
by the centres of $\lambda$ is either a full conic, or a conic with one point removed,
or an open conical arc with endpoints $a$, $b$ ($\in \lambda \cap S$), or a conic with the
points $a$, $b$ removed, or just $p'$. The last possibility occurs only if $p'$ lies in
the intersection of the planes of symmetry of all the centres on $\lambda$. In all other
cases the segments of the tangent lines of $O_{p'}$ which lie on the same side of $S$
as $O_{p'}$ consist of centres. Thus the set of centres has interior points in the
two-plane $\pi = p' \times \lambda$ and all points of $\pi$ which are on the same side of $S$ as
$p'$ are centres. If $p$ and $p'$ are on different sides of $S$ then all points of $\pi - S$
are projective centres.


This completes the proof of our theorem. 


It is easy to see that in case the centres have a limit point in $P^n - S$ most
of the cases described in Theorem 8 can occur. However there are the following 
exceptions. 


THEOREM 9. The closed surface $S$ is quadric in each of the following cases.

1. The projective centres fill one component of $P^n - S$.

2. The projective centres fill a hyperplane $P^{n-1} - S$.

3. The projective centres in $P^n - S$ have two non-coplanar components with
sum of dimensions $n - 1$.


Proof. By induction on $n$. If $n = 2$ then Cases 1 and 2 follow immediately
from Theorem 2. In Case 3 the two components must be one- and
zero-dimensional respectively. In other words there is a line segment $(a, b)$ of
centres and a centre $p$ not on $a \times b$. By Theorem 8 we know that $p$ is the
point of intersection of the axes of symmetry corresponding to the centres of
$(a, b)$ and therefore $a \times b$ is the axis of $p$. By Theorem 2, the curve $S$ consists
either of two conical arcs with endpoints $a$, $b$ and differentiable at $a$, $b$, or of
one conical arc and the segment $(a, b)'$ complementary to $(a, b)$. Since $S$ has
the line $a \times b$ as axis of symmetry the second case is excluded and the first
case possible only if $S$ is a conic.

Now consider any hyperplane $P^{n-1}$ which intersects $S$ in a convex closed
surface $S'$. The projective centres which lie in $P^{n-1}$ (and hence are centres
of $S'$) fill one of the components of $P^{n-1} - S'$ in Case 1, and in Case 2 they
fill an $(n - 2)$-plane $P^{n-2} - S'$. Thus in these cases every plane section of
$S$ is quadric and hence $S$ is quadric.

In order to prove Case 3 we first need a lemma. 


LEMMA 2. If the centres of $S$ have a $k$-dimensional component lying in the $k$-plane
$P^k$, then every $(k + 1)$-plane $P^{k+1}$ through $P^k$ intersects $S$ in a surface $S'$, so that
each component of $S' - P^k$ is quadric (that is, lies in a quadric surface). If $S' - P^k$
has more than one component, then $S'$ is differentiable on $S' \cap P^k$.


Proof. For $n = 2$ the case $k = 0$ is trivial (as it is for all $n$) and the case
$k = 1$ was treated in Theorem 2. We now see by induction that the
intersection of each component of $S' - P^k$ with any $k$-plane $Q^k \subset P^{k+1}$ is quadric
and hence each component is quadric. If $S' - P^k$ has more than one component
then $S' \cap Q^k$ is differentiable on $S' \cap Q^k \cap P^k$. But a convex $(k - 1)$-surface
is differentiable at a point if it is differentiable in $k - 1$ independent directions
at that point.













We now return to the proof of Theorem 9. In Case 3 let one component of
the set of centres be $k$-dimensional and the other $(n - k - 1)$-dimensional.
Let $P^k$ be the plane of the $k$-dimensional component. By Theorem 8 we know
that every hyperplane through $P^k$ is a projective plane of symmetry of $S$. Now
by Lemma 2 every $(k + 1)$-plane $P^{k+1}$ through $P^k$ intersects $S$ in a surface $S'$
so that $S' - P^k$ has quadric components. Since $P^k$ is a plane of symmetry
of $S'$ it follows that either $S' \cap P^k = \emptyset$ or $S' \cap P^k$ is degenerate or $S' - P^k$
is not connected. In the last case $S'$ is differentiable on $P^k$ and hence in all
three cases $S'$ is quadric. By the same argument the intersection $S''$ of $S$ with
any $(n - k)$-plane through the $(n - k - 1)$-dimensional component of the
set of centres is quadric.

Now one of the surfaces $S'$ together with one of the surfaces $S''$ determines
a unique quadric surface $S^*$ whose intersections with the planes $P^{k+1} \supset P^k$
coincide with $P^{k+1} \cap S$. Hence $S^* = S$ and the proof is complete.

For the sake of completeness we state the following simple consequence 
of Theorems 3 and 4. 


THEOREM 10. If a point $p$ on the convex closed surface $S$ is a limit point of
centres of $S$ which are isolated in $P^n - S$, then the curvature of $S$ has
discontinuities in every neighbourhood of $p$.


We conclude this discussion with a few comments on surfaces with a finite 
number of projective centres (not on the surface). The following is an extension 
of Theorems 3 and 4. 


THEOREM 11. If the centres $p_0, \ldots, p_k$ on one side of the surface $S$ span a
$k$-plane $P^k$ which contains points on the other side of $S$, then the group generated
by the corresponding involutions $\gamma_0, \ldots, \gamma_k$ contains infinitely many centres lying
in $P^k$ on the same side of $S$ as $p_i$ $(i = 0, \ldots, k)$.


Proof. For $k = 1$ the proof is that given in the proof of Theorem 3. Now
assume the theorem true for dimensions less than $k$, so that, if $P^k$ contains
only a finite number of centres of $S$, then none of the $(k - 1)$-planes determined
by $k$ of the $p_i$ contains a point on the other side of $S$. Thus $S \cap P^k$ lies in a
closed simplex with the vertices $p_i$ $(i = 0, \ldots, k)$.

If the number of centres in $P^k$ on the side of $p_0$ were finite then there
would exist a minimal simplex $\Sigma$ in $P^k$ whose vertices are centres of $S$ and
which contains $S \cap P^k$ but contains no other centres of $S$. Now the
$(k - 1)$-plane of symmetry of $S \cap P^k$ which corresponds to a vertex $p$ of $\Sigma$ must
intersect the side of $S \cap P^k$ which is interior to $\Sigma$. Hence that plane must
intersect at least one of the edges $pq$ of $\Sigma$ at an interior point. Thus the
image $q\gamma$ of $q$ under the involution on $p$ lies interior to the edge $pq$, in
contradiction to the definition of $\Sigma$.


THEOREM 12. If the (convex) closed surface $S$ has a finite number of projective
centres not on $S$, then the centres on one side span a $k$-plane $P_1^k$, and the
corresponding planes of symmetry intersect in an $(n - k - 1)$-plane $P_1^{n-k-1}$. In a
similar manner the centres on the other side determine the planes $P_2^l$ and $P_2^{n-l-1}$.

The planes $P_1^k$ and $P_2^l$ lie entirely in the (closed) opposite sides of $S$ and
$P_1^k \subset P_2^{n-l-1}$, $P_2^l \subset P_1^{n-k-1}$. Thus the group $\Gamma$ contains the two (not necessarily
proper) subgroups $\Gamma_1$, $\Gamma_2$ which are generated by the centres on one of the sides
of $S$ and every element of $\Gamma_1$ commutes with every element of $\Gamma_2$.


Proof. Let $p_1 \in P_1^k$ be a centre and let $\gamma_1$ be the corresponding involution.
Since $\gamma_1$ maps centres into centres we must have $P_2^l\gamma_1 = P_2^l$. Now the only
planes invariant under $\gamma_1$ are the planes through $p_1$ and those contained in
the hyperplane of symmetry of $\gamma_1$. Since $P_2^l$ contains no point on the side
of $S$ which contains $p_1$ it must lie in the plane of symmetry.

Hence each involution $\gamma_1$ with centre in $P_1^k$ (and hence every element of
$\Gamma_1$) leaves all centres in $P_2^l$ fixed. This means that every $\gamma_1 \in \Gamma_1$ commutes
with the involutions which correspond to these centres, and hence that
$\gamma_1\gamma_2 = \gamma_2\gamma_1$ for all $\gamma_1 \in \Gamma_1$, $\gamma_2 \in \Gamma_2$.


REFERENCES 


1. H. S. M. Coxeter, Regular polytopes (New York, 1947). 

2. Paul Kelly and E. G. Straus, Curvature in Hilbert geometries, Pacific J. Math., 8 (1958),
119-125.

3. H. Kneser, Eine Erweiterung des Begriffes "konvexer Körper", Math. Ann., 82 (1921),
287-296.

4. T. Kojima, On characteristic properties of the conic and the quadric, Sci. Rep. Tohoku Univ., 
8 (1919), 67-68. 


University of California, Santa Barbara 
and 
Los Angeles 








A SEMIMODULAR IMBEDDING OF LATTICES 
D. T. FINKBEINER 


1. Introduction. The study of structural or arithmetic properties of a
general lattice $\mathfrak{L}$ often can be facilitated by imbedding $\mathfrak{L}$ as a sublattice of a
lattice $\mathfrak{S}$ of a more restricted type whose properties are known. However, if
$\mathfrak{S}$ is too restricted, a general imbedding is impossible; for example, $\mathfrak{S}$ cannot
be modular because $\mathfrak{L}$, as a sublattice of $\mathfrak{S}$, would then have to be modular.
One of the best results of this nature has been given by Dilworth in an
unpublished work in which he shows that any finite dimensional lattice is
isomorphic to a sublattice of a semi-modular point lattice (1, pp. 105 and 110). In
the present paper Dilworth's imbedding process is modified to obtain a sharper
result: Any finite dimensional lattice $\mathfrak{L}$ is isometrically isomorphic to a
sublattice of a semi-modular lattice $\mathfrak{S}$ which has the same number of points as
$\mathfrak{L}$ and which preserves basic properties of the join-irreducible arithmetic of $\mathfrak{L}$.

Although the meet-irreducible arithmetic of semi-modular lattices is known 
(2), a corresponding theory of join-irreducible arithmetic remains to be 
developed. The work of this paper suggests that a knowledge of the join- 
irreducible arithmetic of semi-modular lattices would provide a corresponding 
theory for all finite dimensional lattices. 

Aside from possible applications to lattice arithmetic, the imbedding process
is of intrinsic interest. First a pseudo-rank function $s$ is defined on $\mathfrak{L}$. Then
(§ 3) $\mathfrak{L}$ is imbedded as a sublattice of a lattice $\mathfrak{M}$ which is constructed from $\mathfrak{L}$
by introducing between each join irreducible $q \in \mathfrak{L}$ and the element $c$ which is
covered by $q$ a chain of $s(q) - s(c) - 1$ elements which are both join and meet
irreducible in $\mathfrak{M}$. The function $s$ is extended to $\mathfrak{M}$. In §§ 4 and 5 normal
subsets of the set $Q$ of all join irreducibles of $\mathfrak{M}$ are used to define a dependence
relation on $Q$. Finally (§ 6) the subsets of $Q$ which are closed relative to this
dependence relation form a semi-modular lattice $\mathfrak{S}$ whose join irreducibles are
order-isomorphic to $Q$; $\mathfrak{S}$ contains a sublattice which is isomorphic to $\mathfrak{L}$, and
the isomorphism is isometric in the sense that if $a \in \mathfrak{L}$ corresponds to $a^* \in \mathfrak{S}$,
then $s(a)$ is the ordinary rank of $a^*$ in $\mathfrak{S}$.


2. Pseudo-rank functions. If $\mathfrak{S}$ is a semi-modular lattice of finite
dimension, then the usual rank function $r$ on $\mathfrak{S}$ has the properties
(2.1) $r(z) = 0$,
(2.2) if $a$ covers $b$ ($a \succ b$), then $r(a) = r(b) + 1$,

Received May 20, 1959. Final composition of this paper was performed while the author
was a National Science Foundation Faculty Fellow at Princeton University. The author
gratefully acknowledges the influence of R. P. Dilworth upon the results obtained here.




(2.3) $r(a) + r(b) \geq r(a \cup b) + r(a \cap b)$.


Furthermore, if $\mathfrak{L}$ is any sublattice of $\mathfrak{S}$, then on $\mathfrak{L}$ the function $r$ satisfies
(2.3) and


(2.2*) $a \supset b$ implies $r(a) > r(b)$.


THEOREM 2.1. If $\mathfrak{L}$ is any finite dimensional lattice, there exists an integral
valued function defined on $\mathfrak{L}$ which satisfies (2.1), (2.2*), and (2.3).


Proof. Let $u$ and $z$ denote the unit and null elements of $\mathfrak{L}$. For $a \in \mathfrak{L}$ let
$m(a)$ be the maximal length of all chains from $a$ to $z$, and let
$s(a) = 2^{m(u)} - 2^{m(u) - m(a)}$. It is readily verified that $s$ satisfies the conditions stated.

Any function which satisfies the conditions of Theorem 2.1 will be called 
a pseudo-rank function. 
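
The verification left to the reader can be carried out mechanically on a small example. The following sketch assumes the non-modular pentagon lattice (z < a < b < u and z < c < u), chosen here only for illustration; the names `COVERS`, `below`, `is_pseudo_rank` and so on are hypothetical helpers, and s is taken to be $2^{m(u)} - 2^{m(u) - m(a)}$ with m the maximal chain length down to z.

```python
# Hypothetical example lattice (the pentagon): element -> elements it covers.
COVERS = {"z": [], "a": ["z"], "b": ["a"], "c": ["z"], "u": ["b", "c"]}


def below(x):
    """Set of all elements y with y <= x."""
    out = {x}
    for y in COVERS[x]:
        out |= below(y)
    return out


def leq(x, y):
    return x in below(y)


def meet(x, y):
    lower = below(x) & below(y)
    return next(g for g in lower if lower <= below(g))


def join(x, y):
    upper = [e for e in COVERS if leq(x, e) and leq(y, e)]
    return next(j for j in upper if all(leq(j, e) for e in upper))


def m(a):
    """Maximal length of a chain from a down to z."""
    return 0 if not COVERS[a] else 1 + max(m(c) for c in COVERS[a])


def s(a):
    """Candidate pseudo-rank: 2^m(u) - 2^(m(u) - m(a))."""
    top = m("u")
    return 2 ** top - 2 ** (top - m(a))


def is_pseudo_rank(f):
    """Check conditions (2.1), (2.2*) and (2.3) by brute force."""
    elems = list(COVERS)
    zero = f("z") == 0
    strict = all(f(x) > f(y) for x in elems for y in elems
                 if x != y and leq(y, x))
    submodular = all(f(x) + f(y) >= f(join(x, y)) + f(meet(x, y))
                     for x in elems for y in elems)
    return zero and strict and submodular


values = {e: s(e) for e in COVERS}
```

On this lattice the chain-length function m itself violates (2.3) for the pair a, c, while s passes all three conditions, which suggests why the exponential rescaling is used.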


3. Extension of $\mathfrak{L}$. Let $\mathfrak{L}$ be any finite dimensional lattice and $s$ any
pseudo-rank function on $\mathfrak{L}$. The next objective is to imbed $\mathfrak{L}$ as a sublattice
in a lattice $\mathfrak{M}$ which has more join irreducibles but otherwise retains the
arithmetic properties of $\mathfrak{L}$. Each irreducible $q \in \mathfrak{L}$ covers a uniquely
determined element $c$. Let $k = s(q) - s(c) - 1$. Whenever $k > 0$ introduce between
$q$ and $c$ a construction chain of $k$ new elements $q_i$,

$$q \succ q_1 \succ q_2 \succ \cdots \succ q_k \succ c.$$

Only the maximal and minimal elements of each such chain belong to $\mathfrak{L}$, and
distinct chains are either disjoint or have only the minimal element in common.

Let the set $\mathfrak{M}$ consist of the elements of $\mathfrak{L}$ together with the non-extremal
elements of the construction chains. It is easy to define formally the ordering
described above for $\mathfrak{M}$ by superimposing the ordering of $\mathfrak{L}$ and that of the
construction chains. Then $\mathfrak{M}$ is a lattice in which the non-extremal elements
of the construction chains are both meet and join irreducible, and in which
the remaining elements form a sublattice isomorphic to $\mathfrak{L}$. The join irreducible
elements of $\mathfrak{M}$ are those of $\mathfrak{L}$ together with all non-extremal elements of the
construction chains; thus $\mathfrak{M}$ and $\mathfrak{L}$ have the same number of points.

The function $s$ on $\mathfrak{L}$ is extended to a function $r$ on $\mathfrak{M}$ by defining

$$r(b) = s(b) \quad \text{if } b \in \mathfrak{L},$$

$$r(b) = s(b_0) + j \quad \text{if } b \notin \mathfrak{L},$$

where $b_0$ is the minimal element of the construction chain in which $b$ appears,
and where $j$ is the length of that chain from $b$ to $b_0$. Clearly $r$ satisfies (2.1)
and (2.2*); furthermore, (2.3) is satisfied by every $a$ which is join irreducible
in $\mathfrak{M}$.
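
As a worked instance of the construction, the sketch below inserts the construction chains into the pentagon lattice (z < a < b < u, z < c < u) and extends the rank. The lattice, the pseudo-rank values in `S` (those given by the formula in the proof of Theorem 2.1 for this lattice), and the naming scheme `q_1, ..., q_k` for the new elements are all assumptions made for illustration.

```python
# Hypothetical example: pentagon lattice, element -> elements it covers.
COVERS = {"z": [], "a": ["z"], "b": ["a"], "c": ["z"], "u": ["b", "c"]}
# Pseudo-rank values assumed for this lattice.
S = {"z": 0, "a": 4, "b": 6, "c": 4, "u": 7}


def construction_chains(covers, s):
    """For each join irreducible q covering c, list the k = s(q) - s(c) - 1
    new elements q_1 > q_2 > ... > q_k to be inserted between q and c."""
    chains = {}
    for q, lower in covers.items():
        if len(lower) == 1:                  # q covers exactly one element
            c = lower[0]
            k = s[q] - s[c] - 1
            chains[q] = [f"{q}_{i}" for i in range(1, k + 1)]
    return chains


def extend_rank(covers, s):
    """Extend s to r: a new element j steps above c gets rank s(c) + j."""
    r = dict(s)
    for q, chain in construction_chains(covers, s).items():
        c = covers[q][0]
        k = len(chain)
        for i, name in enumerate(chain, start=1):
            r[name] = s[c] + (k - i + 1)     # q_i lies k - i + 1 steps above c
    return r


chains = construction_chains(COVERS, S)
r = extend_rank(COVERS, S)
```

After the extension each covering step along a construction chain raises r by exactly one, so the join irreducibles of the enlarged lattice behave like those of a lattice with an ordinary rank function.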


4. Normal sets of irreducibles. Let $Q$ denote the set of all join irreducible
elements $q \neq z$ of $\mathfrak{M}$. For any $S \subseteq Q$ let $n(S)$ denote the number of elements
in $S$, and for each $b \in \mathfrak{M}$ let $Q_b = \{q \in Q \mid q \subseteq b\}$. Clearly $Q_a \wedge Q_b = Q_{a \cap b}$,
but the corresponding equality for union is not valid.


Definition 4.1. A subset $S \subseteq Q$ is said to be normal if and only if the following
two conditions are satisfied:


($N_1$) If $R \subseteq S$, then $n(R) \leq r(\bigcup R)$,
($N_2$) $n(S) = r(\bigcup S)$.
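
The two conditions can be tested by brute force on a small example. The sketch below assumes a hypothetical lattice, the chain 0 < 1 < 2 < 3 with join given by the maximum and r the identity, so that Q = {1, 2, 3}; the helper names are illustrative.

```python
from itertools import combinations


def subsets(items):
    """All subsets of items, as tuples, in order of increasing size."""
    items = list(items)
    for size in range(len(items) + 1):
        yield from combinations(items, size)


def is_normal(S, r, join_all):
    """Check (N1) n(R) <= r(join of R) for every subset R of S,
    and (N2) n(S) = r(join of S)."""
    S = tuple(S)
    if any(len(R) > r(join_all(R)) for R in subsets(S)):
        return False
    return len(S) == r(join_all(S))


# Hypothetical lattice: the chain 0 < 1 < 2 < 3 (null element 0).
def chain_join(R):
    return max(R, default=0)     # join of the empty set is the null element


def chain_rank(b):
    return b


Q = [1, 2, 3]
normal_sets = [S for S in subsets(Q) if is_normal(S, chain_rank, chain_join)]
```

In this chain the normal sets are exactly the initial segments of Q, each obtained from the previous one by adjoining the next irreducible.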


Normal sets are determined not only by the structure of $\mathfrak{M}$ but also by the
function $r$ which is not uniquely determined by $\mathfrak{M}$. The following lemma
provides the fundamental tool for later proofs.


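LEMMA 4.1. If $S$ and $T$ are normal sets such that $s = \bigcup S \in \mathfrak{L}$ and $t = \bigcup T \in \mathfrak{L}$,
there exists a normal set $N \subseteq S \vee T$ such that $\bigcup N = s \cup t$.


Proof. Since $\bigcup(S \wedge T) \subseteq s \cap t$, we have

$$n(S \vee T) = n(S) + n(T) - n(S \wedge T) \geq r(s) + r(t) - r(\bigcup(S \wedge T)) \geq r(s) + r(t) - r(s \cap t) \geq r(s \cup t).$$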


An inductive argument is used to show that $S \vee T$ contains a normal subset
$N$ such that $\bigcup N = s \cup t$. Let $b \in \mathfrak{M}$ be minimal such that $s \subset b \subseteq s \cup t$
and $r(b) - r(s) \leq n(T \wedge (Q_b - Q_s))$. Since $s \cup t$ satisfies these two
requirements, such minimal elements exist. Choose $B \subseteq T \wedge (Q_b - Q_s)$ such that
$n(B) = r(b) - r(s)$, and let $R = S \vee B$. Then $S$ and $B$ are disjoint, $\bigcup R \subseteq b$,
and $n(R) = r(b)$. To prove that $R$ is normal, it suffices to show that $W \subseteq R$
implies $n(W) \leq r(\bigcup W)$, for then $n(R) \leq r(\bigcup R) \leq r(b) = n(R)$.

Suppose that $W \subseteq R$ exists such that $n(W) > r(\bigcup W)$. Write $W = S' \vee B'$,
where $S' \subseteq S$ and $B' \subseteq B$. Clearly $S'$ and $B'$ are disjoint, and $B'$ may be
assumed to be non-void since otherwise $W$ is a subset of the normal set $S$.
Also $w = \bigcup W \not\subseteq s$, since $B'$ is non-void and disjoint from $Q_s$. First suppose
$w \notin \mathfrak{L}$; that is, $w$ is an irreducible of $\mathfrak{M}$ introduced by the construction chains.
Let $w_1 \in \mathfrak{L}$ be the minimal element of the construction chain $C$ in which $w$
appears. Then

$$B' = (B' \wedge Q_{w_1}) \vee (B' \wedge C)$$

is a disjoint union. Clearly $n(B' \wedge C) \leq r(w) - r(w_1)$. Thus

$$r(w) < n(W) = n(S') + n(B') \leq n(S') + n(B' \wedge Q_{w_1}) + r(w) - r(w_1),$$

$$r(w_1) < n(S') + n(B' \wedge Q_{w_1}).$$


Let W’ = S’ V (B’ A Qui), and let w’ = U W’. Since U S’ C wm, we have 
w Cw, Then n(W’) > r(w’). If w’ ¢&, the argument may be repeated, 
reducing W’ to a smaller set. This process must end before all the elements 
of B’ are removed because otherwise n(S’) > r(\U S’), contradicting the fact 

















IMBEDDING OF LATTICES 585 


that S’ is a subset of the normal set S. Hence we need to consider only the 
case for which W C R, n(W) > r(U W), and UW € & Then 


US =UWASC(UW) A (US) = wos, 
r(w) < n(W) = n(S’) + n(B’) < r(U S’) + n(B’) 
<r(wo s) + n(B’) < r(w) + 1r(s) —r(wus) + n(B’) 
= r(w) + n(S) — r(wuUs) + n(B’). 


Hence r(w ys) < n(S) + n(B’) < n(S) + n(B) = n(R) = r(b). Since w C s> 
we have sC wus Cb. Also B’ CT A (Qw, — Q,), and therefore 


r(wUs) — r(s) < n(B’) < n(T A (Qu: — Q;)). 


But this contradicts the minimal property assumed for 6. Thus the normal 
set S has been extended to a normal set R by adjoining certain elements of 
the normal set 7. The argument can be iterated, replacing S by R, to con- 
struct a normal set VN C S V T such that U N = svt. 

Observe that in this proof $S$ was augmented by elements of $T$ to produce
a normal set $N = S \vee T'$, where $T' \subseteq T$. The roles of $S$ and $T$ could have
been interchanged, so there also exists a normal set $N' = S' \vee T$, where
$S' \subseteq S$ and $\bigcup N = \bigcup N' = s \cup t$.


LEMMA 4.2. For each $b \in \mathfrak{M}$ there exists a normal set $B$ such that $\bigcup B = b$.


Proof. This is trivial for all points of $\mathfrak{M}$; we proceed by induction. Let
$r(b) = k$, and assume that the lemma holds for all $a \in \mathfrak{M}$ for which $r(a) < k$.
If $b$ is irreducible, $b > c$, and by the induction hypothesis there exists a normal
set $C$ for which $c = \bigcup C$. Then $B = C \vee (b)$ is normal and $b = \bigcup B$. If $b$
is reducible, then $b \in \mathfrak{L}$ and $b = s \cup t$ for suitable $s, t \in \mathfrak{L}$. By the induction
hypothesis there exist normal sets $S$, $T$ with $s = \bigcup S$ and $t = \bigcup T$, and
Lemma 4.1 then guarantees the existence of a normal set $B \subseteq S \vee T$ such
that $\bigcup B = b$.


5. The normal dependence relation. The next step in the imbedding
process is to define a dependence relation $\Delta$ between the elements and subsets
of $Q$ and to develop its properties. For $S \subseteq Q$, let
\[
S^* = \{q^* \in Q \mid q^* \subseteq q \text{ for some } q \in S\}.
\]

Definition 5.1. An irreducible $q \in Q$ is said to depend normally on a subset
$S \subseteq Q$, written $q \,\Delta\, S$, if and only if $q \subseteq \bigcup T$ for some normal set $T \subseteq S^*$.
(The notation $P \,\Delta\, R$ is used to mean $q \,\Delta\, R$ for every $q \in P$.)

As immediate consequences of this definition we have


(5.1) $S^* \,\Delta\, S$ for every non-void $S \subseteq Q$,
(5.2) if $S \,\Delta\, T$, then $S^* \,\Delta\, T$,
(5.3) if $q \,\Delta\, S$, then $q \subseteq \bigcup S$.
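
Definition 5.1 can be exercised in a toy model. The sketch below is an illustration under stated assumptions, not the construction of the paper: the lattice is taken to be the subspace lattice of GF(2)^k, $r$ is dimension, and the irreducibles are the points, so that $S^* = S$. All function names are hypothetical.

```python
from itertools import combinations


def span(vectors):
    """Subspace of GF(2)^k spanned by `vectors` (ints used as bit vectors)."""
    space = {0}
    for v in vectors:
        space |= {u ^ v for u in space}
    return frozenset(space)


def rank(subspace):
    """Dimension of a subspace, read off from its size 2**dim."""
    return len(subspace).bit_length() - 1


def join(elements):
    """Lattice join (sum) of a collection of subspaces."""
    return span(v for sub in elements for v in sub)


def is_normal(T):
    """Conditions (N1) and (N2) of Definition 4.1 in this model."""
    T = list(T)
    for k in range(1, len(T) + 1):
        for R in combinations(T, k):
            if len(R) > rank(join(R)):
                return False
    return len(T) == rank(join(T))


def depends_normally(q, S):
    """q depends on S: q lies below the join of some normal subset T of S*."""
    S = list(S)  # every irreducible is a point in this model, so S* = S
    for k in range(1, len(S) + 1):
        for T in combinations(S, k):
            if is_normal(T) and q <= join(T):
                return True
    return False
```

Here a point depends normally on a set of points exactly when it lies in their span, which is the familiar linear dependence relation.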











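LEMMA 5.1. $\Delta$ also satisfies


($\Delta$1) $q' \subseteq q$ implies $q' \,\Delta\, S \vee q$ for any $S \subseteq Q$,

($\Delta$2) $q \,\Delta\, S$ and $S \,\Delta\, T$ imply $q \,\Delta\, T$,

($\Delta$3) $q' \,\Delta\, q$ implies $q' \subseteq q$,

($\Delta$4) $q \,\Delta\, S$ and $S \,\Delta\, q$ imply $q \in S$,

($\Delta$5) if $q'' \subset q'$ implies $q'' \,\Delta\, S$, then $q \,\Delta\, S \vee q'$ implies either $q \,\Delta\, S$
or $q' \,\Delta\, S \vee q$.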


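Proof. ($\Delta$1) follows directly from (5.1), while ($\Delta$3) and ($\Delta$4) both follow
from (5.3). Consider ($\Delta$2), and let $q \,\Delta\, S$ and $S \,\Delta\, T$; there exists a normal set
$M \subseteq S^*$ such that $q \subseteq \bigcup M = m$. By (5.2) $M \,\Delta\, T$. If $m \notin \mathfrak{L}$, $m$ must be
join-irreducible, which implies $q \in S^*$ and $q \,\Delta\, T$. Hence consider $m \in \mathfrak{L}$.
Write $M$ as a disjoint union, $M = (M \wedge T^*) \vee M_1$. If $M_1$ is void, $M \subseteq T^*$,
and $q \,\Delta\, T$. Otherwise for each $q_i \in M_1$ there exists a normal set $T_i \subseteq T^*$
such that $q_i \subseteq \bigcup T_i = t_i$. If $t_i \notin \mathfrak{L}$ for some $q_i \in M_1$, then $t_i$ is join-irreducible,
and $q_i \in T^*$, contrary to $q_i \in M_1$. Hence $t_i \in \mathfrak{L}$ for each $i$. Apply Lemma 4.1
a finite number of times to obtain a normal set
\[
R \subseteq \bigvee_{q_i \in M_1} T_i
\]
such that $\bigcup R = \bigcup t_i = t \in \mathfrak{L}$. Clearly $R \subseteq T^*$. Apply Lemma 4.1 again
to $R$ and $M$ to obtain a normal set $N$ of the form $N = R \vee M'$, where $M' \subseteq M$
and $\bigcup N = t \cup m$. Then $N \subseteq T^*$, for if $q_i \in N \wedge M_1$, then
\[
\textstyle\bigcup (R \vee q_i) = t \cup q_i \subseteq t \cup t_i = t.
\]
Hence
\[
r(\textstyle\bigcup (R \vee q_i)) = r(t) < r(t) + 1 = n(R) + 1 = n(R \vee q_i),
\]
which contradicts the normality of $N$. But $q \subseteq m \subseteq \bigcup N$, where $N$ is a
normal subset of $T^*$, so $q \,\Delta\, T$.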

To verify ($\Delta$5), assume that $q'' \,\Delta\, S$ for all $q'' \subset q'$, that $q \,\Delta\, S \vee q'$, but
that $q \not\Delta\, S$. Then by (5.1) and ($\Delta$2) $q'$ is the only element of $(S \vee q')^*$ which
does not depend normally on $S$. By the definition of $\Delta$, there exists a normal
set $T \subseteq (S \vee q')^*$ such that $q \subseteq \bigcup T = t$. Since $q \,\Delta\, T$, we may assume
$q' \in T$, for otherwise $T \,\Delta\, S$, from which follows $q \,\Delta\, S$, contrary to hypothesis.
Thus we write $T = T' \vee q'$, where $T' \,\Delta\, S$. We assert that $T' \vee q$ is normal
and $t = \bigcup (T' \vee q)$. Let $R$ be any subset of $T' \vee q$. If $R \subseteq T'$, then $R \subseteq T$,
so $n(R) \le r(\bigcup R)$ since $T$ is normal. If $R \not\subseteq T'$, then $R = R' \vee q$, where
$R' \subseteq T' \subseteq T$. Then
\[
(5.4)\qquad n(R') = n(R) - 1 \le r(\textstyle\bigcup R') \le r(\bigcup R).
\]
Suppose $R_0 = R_0' \vee q \subseteq T' \vee q$ exists such that $n(R_0) - 1 = r(\bigcup R_0)$.
Then from (5.4), $n(R_0') = n(R_0) - 1 = r(\bigcup R_0') = r(\bigcup R_0)$. Since
$R_0' \subseteq T' \subseteq T$, $R_0'$ is normal. Also $q \subseteq \bigcup R_0 = \bigcup R_0'$. Hence $q \,\Delta\, R_0'$; but
$R_0' \,\Delta\, S$, so $q \,\Delta\, S$, which is a contradiction. Thus from (5.4) we obtain
$n(R) \le r(\bigcup R)$ for all $R \subseteq T' \vee q$, and $T' \vee q$ satisfies (N$_1$). But since
$T = T' \vee q'$ is normal and $q \subseteq \bigcup T$,
\[
\begin{aligned}
n(T' \vee q) = n(T') + 1 &\le r(\textstyle\bigcup (T' \vee q)) \\
&\le r(\textstyle\bigcup T) = r(\bigcup (T' \vee q')) \\
&= n(T) = n(T') + 1.
\end{aligned}
\]
Therefore, both of these inequalities must be equalities, and
\[
(5.5)\qquad n(T' \vee q) = r(\textstyle\bigcup (T' \vee q)) = r(\bigcup T).
\]
Thus $T' \vee q$ satisfies (N$_2$) and is normal. But also $\bigcup (T' \vee q) \subseteq \bigcup T$, so
(5.5) implies that equality holds. Since $q' \in T$, $q' \subseteq \bigcup (T' \vee q)$. Hence
$q' \,\Delta\, T' \vee q$, and $q' \,\Delta\, S \vee q$. This completes the proof of Lemma 5.1.





6. The normal imbedding. It was shown by the author (3) that any
relation $\Delta$ on a partially ordered set $Q$ determines a complete semi-modular
lattice $\mathfrak{S}$ whose set of join irreducibles is order isomorphic to $Q$, provided $\Delta$
satisfies the five properties of Lemma 5.1. The elements of $\mathfrak{S}$ are the closed
subsets of $Q$, where $S \subseteq Q$ is said to be closed if and only if $q \,\Delta\, S$ implies
$q \in S$. The closed subsets determined by the normal dependence relation
form the imbedding lattice which was described in the introduction. Recall
that for each $b \in \mathfrak{M}$, $Q_b$ denotes the set of all $q \in Q$ such that $q \subseteq b$.

THEOREM. Let $\mathfrak{S}$ be the lattice of all subsets of $Q$ which are closed under the
normal dependence relation. Then
(6.1) $\mathfrak{S}$ is a complete semi-modular lattice whose set of join irreducibles is
isomorphic to $Q$ under the mapping $q \to Q_q$,
(6.2) $\mathfrak{S}$, $\mathfrak{M}$, and $\mathfrak{L}$ have the same number of points,
(6.3) the mapping $b \to Q_b$ is a one-to-one mapping of $\mathfrak{M}$ onto a lattice within
$\mathfrak{S}$ and an isomorphism of $\mathfrak{L}$ onto a sublattice of $\mathfrak{S}$,
(6.4) for every $b \in \mathfrak{M}$, $r(b)$ is the ordinary rank of $Q_b$ in $\mathfrak{S}$,
(6.5) properties of the join arithmetic of $\mathfrak{L}$ are preserved in $\mathfrak{S}$.

Proof. The precise meaning of (6.5) is contained in the statement of Lemma
6.4. Theorem 3 of (3) establishes (6.1). Hence $\mathfrak{S}$ and $\mathfrak{M}$ (and therefore $\mathfrak{L}$)
have the same number of points. The remaining statements are established
by a sequence of lemmas.

LEMMA 6.1. For each $b \in \mathfrak{M}$, $Q_b \in \mathfrak{S}$.

Proof. Since $Q_b = Q_b^*$ and $b = \bigcup Q_b$, $Q_b$ is closed for each $b \in \mathfrak{M}$.

LEMMA 6.2. For all $a, b \in \mathfrak{L}$, $Q_a \cup Q_b = Q_{a \cup b}$.

Proof. $Q_a \cup Q_b$ is the smallest closed set containing $Q_a \vee Q_b$; hence
$Q_a \cup Q_b \subseteq Q_{a \cup b}$. By Lemma 4.2, there exist normal sets $A \subseteq Q_a$ and $B \subseteq Q_b$
such that $\bigcup A = a$ and $\bigcup B = b$. Apply Lemma 4.1 to obtain a normal set
$N \subseteq A \vee B$ such that $\bigcup N = \bigcup (A \vee B) = a \cup b$. If $q \in Q_{a \cup b}$ then $q \,\Delta\, N$.
This implies $Q_{a \cup b} \subseteq Q_a \cup Q_b$, so equality holds.

Thus the mapping $b \to Q_b$ preserves joins of elements of $\mathfrak{L}$; since it also
preserves intersections, $\mathfrak{L}$ is mapped isomorphically onto a sublattice of $\mathfrak{S}$.
Clearly, for $a, b \in \mathfrak{M}$, $a \subseteq b$ if and only if $Q_a \subseteq Q_b$. Hence the mapping of
$\mathfrak{M}$ into $\mathfrak{S}$ is order-preserving. Intersections are preserved, but joins are not,
in general. Since $Q_a \vee Q_b \subseteq Q_{a \cup b}$, the image of $\mathfrak{M}$ forms within $\mathfrak{S}$ a lattice
which is isomorphic to $\mathfrak{M}$ but which is not necessarily a sublattice of $\mathfrak{S}$.

To prove (6.4) we use an inductive argument. For $b \in \mathfrak{M}$ if $r(b) = 1$, then
$Q_b = (b)$ is a point of $\mathfrak{S}$. Suppose the rank in $\mathfrak{S}$ of $Q_c$ is $r(c)$ for all $c \subset b$. If
$b$ is irreducible and $b > c$, then $Q_b = Q_c \vee b > Q_c$ in $\mathfrak{S}$, so $r(b) = r(c) + 1$
is the rank of $Q_b$ in $\mathfrak{S}$. If $b$ is reducible in $\mathfrak{M}$, then $b \in \mathfrak{L}$. Let $b > c$ in $\mathfrak{L}$ and
let $q \in \mathfrak{L}$ be such that $q > c \cap q$ in $\mathfrak{L}$. Then
\[
j = r(b) - r(c) \le r(q) - r(c \cap q) = k.
\]
In $\mathfrak{M}$ there exists a chain $q = q_k > q_{k-1} > \dots > q_1 > c \cap q$. Let
$S_i = Q_c \vee q_1 \vee \dots \vee q_i$ for $1 \le i \le k$. Then we assert

(a) $S_1, S_2, \dots, S_{j-1}$ are closed, and

(b) $Q_b \,\Delta\, S_j$.

If these two statements are valid, then in $\mathfrak{S}$ we have
\[
Q_b > S_{j-1} > \dots > S_1 > Q_c,
\]
so the rank of $Q_b$ is $r(b) = r(c) + j$. Thus (6.4) follows from the next lemma.

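LEMMA 6.3. In $\mathfrak{L}$ let $b > c$, where $b$ is reducible, and let $q_k \subseteq b$ be such that
$q_k > q_k \cap c$. Let the construction chain in $\mathfrak{M}$ which is headed by $q_k$ be
$q_k > q_{k-1} > \dots > q_1 > q_k \cap c$, where $k \ge r(b) - r(c) = j$. Let
$S_i = Q_c \vee q_1 \vee \dots \vee q_i$ for $i \le j$. Then

(a) $S_1, \dots, S_{j-1}$ are closed, and

(b) $Q_b \,\Delta\, S_j$.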

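Proof. For $i < j$ let $q \,\Delta\, S_i$; there exists a normal subset $N \subseteq S_i^* = S_i$ such
that $q \subseteq \bigcup N$. First assume $\bigcup N \in \mathfrak{L}$ and $c \cup \bigcup N = b$. Let $M \subseteq Q_c$ be
normal such that $\bigcup M = c$. By Lemma 4.1 extend $N$ by adjoining elements
of $M$ to obtain a normal set $B \subseteq S_i$ for which $\bigcup B = b$. Then
\[
n(B \wedge Q_c) \ge n(B) - i = r(b) - i > r(b) - j = r(c) \ge r(\textstyle\bigcup (B \wedge Q_c)).
\]
This contradicts the normality of $B$. Hence either $\bigcup N \notin \mathfrak{L}$ or $c \cup \bigcup N \subset b$.
In the latter case $\bigcup N \subseteq c$ since $b > c$. Then $q \,\Delta\, N \subseteq Q_c \subseteq S_i$. If $c \cup \bigcup N = b$
then $\bigcup N \notin \mathfrak{L}$, so $\bigcup N = q_s$ for some $s = 1, \dots, i$. Then
\[
q \,\Delta\, N \subseteq Q_{q_s} \subseteq S_i.
\]
Hence $S_i$ is closed.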





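To prove (b) we show that $P = M \vee q_1 \vee \dots \vee q_j$ is normal. Since
$\bigcup P = b$ it will follow that $q \,\Delta\, P \subseteq S_j$ for all $q \in Q_b$. Clearly
\[
n(P) = n(M) + j = r(c) + j = r(b).
\]
Let $T \subseteq P$, and write $T$ as a disjoint union,
\[
T = C \vee D, \quad \text{where } C \subseteq M \text{ and } D \subseteq \{q_1, \dots, q_j\}.
\]
If $\bigcup T \notin \mathfrak{L}$, either $D$ is void or $\bigcup T = q_s$ for some $s = 1, \dots, j$. In the former
case $T \subseteq M$ so $r(\bigcup T) \ge n(T)$; in the latter case
\[
T \subseteq (q_1 \vee \dots \vee q_s) \vee (C \wedge Q_{c \cap q_k}).
\]
But
\[
n(C \wedge Q_{c \cap q_k}) \le r(c \cap q_k)
\]
since $C \subseteq M$. Thus
\[
n(T) \le s + r(c \cap q_k) = r(q_s) = r(\textstyle\bigcup T).
\]
Finally suppose $\bigcup T \in \mathfrak{L}$. Then
\[
n(T) = n(C) + n(D) \le r(\textstyle\bigcup C) + j \le r(c \cap \bigcup T) + r(b) - r(c),
\]
since $\bigcup C \subseteq c \cap \bigcup T$. But since $r$ satisfies (2.2) on $\mathfrak{L}$,
\[
r(c \cap \textstyle\bigcup T) \le r(c) + r(\bigcup T) - r(c \cup \bigcup T).
\]
Hence $n(T) \le r(\bigcup T)$, and $P$ is normal. This completes the proof of Lemma
6.3 and consequently (6.4).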


LEMMA 6.4. For $b \in \mathfrak{L}$ and $q_i^* \in Q$, let
\[
Q_b = \bigcup_{i=1}^{m} Q_{q_i^*}
\]
be a reduced join representation having the least possible number of components.
Then there exist $q_i \in \mathfrak{L}$, $i = 1, \dots, m$, such that both
\[
b = \bigcup_{i=1}^{m} q_i
\]
and
\[
Q_b = \bigcup_{i=1}^{m} Q_{q_i}
\]
are reduced representations.


Proof. From the isomorphism, for $q_i \in \mathfrak{L}$ the representation
\[
b = \bigcup_{i=1}^{m} q_i
\]
is reduced in $\mathfrak{L}$ if and only if
\[
Q_b = \bigcup_{i=1}^{m} Q_{q_i}
\]
is reduced in $\mathfrak{S}$. Thus the join representations in $\mathfrak{L}$ are carried intact into $\mathfrak{S}$;
however, some irreducibles of $\mathfrak{S}$ may not be the image of an irreducible of
$\mathfrak{L}$. If
\[
Q_b = \bigcup_{i=1}^{m} Q_{q_i^*}
\]
for $q_i^* \in Q$, let $q_i \in \mathfrak{L}$ be the maximal element of that construction chain in
which $q_i^*$ appears. Then
\[
Q_b = \bigcup_{i=1}^{m} Q_{q_i}.
\]
If no representation of $Q_b$ has fewer than $m$ components, this representation
is reduced, as is
\[
b = \bigcup_{i=1}^{m} q_i.
\]
This completes the proof of the main theorem.


Our concluding remarks are directed to the problem of determining in
what sense the normal imbedding is minimal. First, it is clear that among
all semi-modular lattices which contain $\mathfrak{L}$ as an isometric sublattice, $\mathfrak{S}$ has
the fewest points, and also the smallest possible number of join irreducible
elements. Furthermore, if $\mathfrak{L}$ is already semi-modular, then the normal imbedding
lattice $\mathfrak{S}$, based on the usual rank function for $\mathfrak{L}$, is isomorphic to
$\mathfrak{L}$. Even if $\mathfrak{L}$ is not semi-modular, $\mathfrak{S}$ is isometrically isomorphic to a sublattice
of the semi-modular point lattice of Dilworth's imbedding. One might suspect,
then, that $\mathfrak{S}$ is isomorphic to a sublattice of any semi-modular lattice which
contains $\mathfrak{L}$ as a sublattice and preserves the rank function originally defined
on $\mathfrak{L}$. However, a simple counter-example reveals that no general imbedding
exists which is minimal in this sense. Consider the lattice diagrams shown
below, in which the lattice $\mathfrak{L}$, whose elements are denoted by small circles,
has been imbedded isometrically in the semi-modular lattices S and ®, using 
height on the diagram as rank function. S is the normal imbedding lattice 
of this paper, and ® is clearly the smallest imbedding lattice possible, yet 
neither is a sublattice of the other.

[Figure: lattice diagrams of the two imbeddings.]


REFERENCES 


1. G. Birkhoff, Lattice Theory (rev. ed.), Amer. Math. Soc. Coll. Pub., 24 (1948). 

2. R. P. Dilworth, Arithmetic theory of Birkhoff lattices, Duke Math. J., 8 (1941), 286-299. 

3. D. T. Finkbeiner, A general dependence relation for lattices, Proc. Amer. Math. Soc., 2 (1951), 
756-759. 


Kenyon College 











INTERSECTION IRREDUCIBLE IDEALS OF A 
NON-COMMUTATIVE PRINCIPAL IDEAL DOMAIN 


EDMUND H. FELLER 


Introduction. Let $R$ always denote a fixed non-commutative principal
ideal domain. A right (left) ideal $aR$ ($Ra$) is termed right (left) $\cap$ irreducible
provided it is not the intersection of two right (left) ideals that properly include
it. In this case, the element $a$ is called right (left) $\cap$ irreducible.

Since $R$ satisfies the A.C.C. for right ideals every right ideal $aR$ can be
written in the form $aR = a_1R \cap a_2R \cap \dots \cap a_nR$, where each $a_iR$ properly
includes $aR$ and is right $\cap$ irreducible, $i = 1, 2, \dots, n$. We shall investigate
properties (including primary properties) of right $\cap$ irreducible one-sided and
two-sided ideals of $R$. These properties will depend on the results given in (1)
and (2, chapter III).

An element $a$ is irreducible if it is not zero or a unit and has no proper factors.
In this case $aR$ ($Ra$) is a maximal right (left) ideal.


1. Right $\cap$ irreducible one-sided ideals. From Theorem 2 of (2, p. 31)
it follows that the zero ideal is right and left $\cap$ irreducible. This is also trivially
true for $R$. The special case where $a$ is irreducible will be discussed briefly in
§3. Hence in this paper, unless otherwise stated, the ideals $aR$ mentioned
will not be 0, $R$ or maximal.


THEOREM 1.1. For $aR \subset R$ the following statements are equivalent:

1. $a$ is right $\cap$ irreducible.

2. $aR$ is contained in a unique minimal right overideal not equal to $R$.

3. $Ra$ is contained in a unique maximal left ideal.

4. If $a = bc = b'c'$ where $c'$ and $c$ are irreducible then $Rc = Rc'$.

5. If $a = bc = b'c'$ where $c'$ and $c$ are irreducible then $bR = b'R$.


Proof. $1 \leftrightarrow 2$. Let $a$ be right $\cap$ irreducible. From Theorem 1 of (2, p. 31)
we have that every right ideal has at least one minimal overideal. If $bR$ and
$dR$ are distinct minimal overideals of $aR$ then $aR = bR \cap dR$, a contradiction.
Conversely, let $bR$ be the unique minimal overideal of $aR$. Suppose $aR = dR \cap eR$
where $dR$ and $eR$ properly include $aR$. Hence $bR \subseteq dR$ and $bR \subseteq eR$
and $aR \subset bR \subseteq dR \cap eR$, a contradiction.

In order, we now show that $2 \to 3 \to 4 \to 5 \to 2$. To show that $2 \to 3$ let
$bR$ be the unique minimal overideal of $aR$. Then $a = bc$ where $c$ is irreducible
(2, p. 34). Hence $Ra \subseteq Rc$. Suppose $Ra \subseteq Rc'$ where $Rc'$ is a maximal left
ideal. Then $a = dc'$ and $aR \subset dR$. Therefore, $bR \subseteq dR$, that is, $b = dk$.
Hence $a = dc' = dkc$ which implies $c' = kc$. Thus $k$ is a unit and $Rc' = Rc$.





Received April 23, 1959. 


$3 \to 4$. Let $Rc$ be the unique maximal left ideal containing $Ra$. If $a = b'c' = b''c''$
where $c'$ and $c''$ are irreducible, then $Ra \subseteq Rc'$ and $Ra \subseteq Rc''$. Hence
$Rc = Rc' = Rc''$.

$4 \to 5$. Let $a = b'c' = b''c''$ where $c'$ and $c''$ are irreducible. By 4 we have
$Rc' = Rc''$. Hence there exists a unit $u$ such that $c' = uc''$. Therefore,
$a = b'uc'' = b''c''$. Thus $b'u = b''$ and $b'R = b''R$.

$5 \to 2$. Suppose $bR$ and $b'R$ are minimal overideals of $aR$. Then $a = bc = b'c'$
where $c$ and $c'$ are irreducible. By 5 we can conclude that $bR = b'R$.

From 3 of this theorem the following corollary is immediate. 


COROLLARY 1.1. Let $aR$ be right $\cap$ irreducible. If $Ra \subset Rd_1 \subset Rd_2 \subset \dots \subset
Rd_n \subset R$ is a composition series where $Rd_n$ is the unique maximal left ideal
containing $Ra$, then $d_i$, $i = 1, 2, \dots, n$, is right $\cap$ irreducible.


From the corollary we have


COROLLARY 1.2. Let $a$ be right $\cap$ irreducible. If $a = kd = k'd'$ where $Rd = Rd'$
then $a = kbc = k'b'c'$ where $bR = b'R$.


STATEMENT 1.1. Let $a$ be right $\cap$ irreducible. If $bR$ is the unique minimal
overideal of $aR$ and $a = bc$ then
\[
cR = \{aR \backslash b\}_r = \{x \mid bx \in aR,\ x \in R\}.
\]

Proof. Certainly $cR \subseteq \{aR \backslash b\}_r$. If $d \in \{aR \backslash b\}_r$ then $bd = am = bcm$.
Hence $d = cm$ and $d \in cR$.

From (2, p. 34) we know that every element may admit several factorizations
as a product of irreducible elements. Part 4 of Theorem 1.1 tells us that $a$
is right $\cap$ irreducible if and only if for any two factorizations $a = a_1a_2 \cdots a_n =
b_1b_2 \cdots b_n$ where the $a_i$'s and $b_i$'s are irreducible we have $Ra_n = Rb_n$.


2. Primary properties. The ring $R$ can be considered as an $A - K$
module as defined in (1) by taking as $A$ the ring of left multiplications and as
$K$ the ring of right multiplications. By Theorems 2.1 and 2.2 of (1) we have
the following.

Let $aR$ be right $\cap$ irreducible. The normalizer (centralizer) $B$ of $aR$ is
$\{b \mid ba \in aR\}$. Then $P = \{x \mid x^n \in aR,\ x \in B\}$ is a completely prime two-sided
ideal of $B$, and $cx \in aR$ for $c \in B$, $x \notin aR$ implies $c \in P$.

Since $aR$ is a right ideal we may consider $R - aR$ as an $R$ module. Let us
now apply Theorem 1 of (3, p. 25) to this $R$ module. Since $R \ni 1$, the module
$R - aR$ is cyclic. Then $I = \{x \mid \bar{1}x = \bar{0},\ x \in R\}$ is a right ideal of $R$ where
$\bar{1}$ and $\bar{0}$ denote the cosets of 1 and 0 in $R - aR$. Certainly $I = aR$. Since
R > 1 the ideal K = {k|RR CaR,k € B} = aR. Applying the theorem above 
we have the ring of endomorphisms C of R — aR is B/aR. 

Since $a \neq 0$ the module $R - aR$ satisfies both chain conditions for submodules.
Since $aR$ is right $\cap$ irreducible certainly $R - aR$ is indecomposable.
Applying Theorem 3 of (2, p. 57) then the ring of endomorphisms of $R - aR$,
which is $B/aR$, is completely primary, that is, if $P = \{x \mid x^n \in aR,\ x \in B\}$ then
$(B/aR)/(P/aR) = B/P$ is a division ring. Hence $P$ is a maximal right ideal
of $B$ and we have proved


THEOREM 2.1. If $aR$ is right $\cap$ irreducible and $B$ the normalizer of $aR$ in
$R$, then the radical $P$ of $aR$ in $B$ is a maximal right ideal and $B/P$ is a division
ring.


LEMMA 2.1. Let $aR$ be right $\cap$ irreducible and $P$ the radical of $aR$ in $B$. If
$bR$ is the unique minimal overideal of $aR$ then $xbR \subseteq aR$ for $x \in P$.


Proof. If $x \in P$ then $x^n \in aR$ for some integer $n$. If $n = 1$ the statement is
obviously true. Assume $n > 1$ and $x^n \in aR$, $x^{n-1} \notin aR$. Since $x^{n-1} \notin aR$ then
$bR \subseteq x^{n-1}R + aR$. Hence $b = x^{n-1}g_1 + ag_2$, $xb = x^ng_1 + xag_2$. Since $x \in B$
and $x^n \in aR$ we have $xb \in aR$. Hence $xbR \subseteq aR$.


STATEMENT 2.1. Let $aR$ be right $\cap$ irreducible and $P$ be the radical of $aR$ in its
normalizer $B$. If $bR$ is the unique minimal overideal of $aR$ then $P = \{aR \backslash b\}_l =
\{x \mid xb \in aR,\ x \in R\}$.


Proof. By Lemma 2.1 certainly $P \subseteq \{aR \backslash b\}_l$. If $x \in \{aR \backslash b\}_l$ then $xb \in aR$
and $xbR \subseteq aR$. Since $aR \subset bR$ certainly $x \in B$. From the discussion at the
beginning of this section and since $xb \in aR$, $x \in B$, $b \notin aR$ then $x \in P$.


3. Summary. In the commutative case for a principal ideal domain an
ideal $aR$ is $\cap$ irreducible if and only if $aR = c^nR$ where $c$ is irreducible. Here
$aR \subseteq cR$, $c$ is irreducible, $cR$ is maximal and $cR$ is called the radical of
$aR$. (Read (4, chapter 1).)
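
The commutative statement can be checked numerically in the most familiar principal ideal domain. The sketch below assumes $R = \mathbb{Z}$ (an illustrative choice, not taken from the paper) and uses the fact that $d\mathbb{Z} \cap e\mathbb{Z} = \operatorname{lcm}(d, e)\mathbb{Z}$; the function names are hypothetical.

```python
from math import gcd


def is_meet_irreducible(a):
    """True if aZ is not the intersection of two ideals properly containing it."""
    # dZ properly contains aZ exactly when d is a proper divisor of a.
    divisors = [d for d in range(1, a) if a % d == 0]
    return not any(d * e // gcd(d, e) == a for d in divisors for e in divisors)


def is_prime_power(a):
    """True if a = p**n for a prime p and n >= 1 (requires a >= 2)."""
    p = next(p for p in range(2, a + 1) if a % p == 0)  # smallest prime factor
    while a % p == 0:
        a //= p
    return a == 1
```

For example, $12\mathbb{Z} = 4\mathbb{Z} \cap 6\mathbb{Z}$ is reducible, while $8\mathbb{Z}$ is not an intersection of larger ideals.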

In the non-commutative case the radical $P$ of $aR$ in $B$ has many of the
properties of the radical in the commutative case. These are: (1) if $x \in P$
then $x^n \in aR$. (2) $P$ is maximal in $B$. (3) $P = \{aR \backslash b\}_l$ by Statement 2.1.
(4) If $xd \in aR$, $x \in B$, $d \notin aR$ then $x \in P$.

If $Rc$ is the unique maximal left ideal containing $Ra$ then $c$ has many of the
properties of the element which generates the radical in the commutative case.
These are: (1) $Rc$ and $cR$ are maximal. (2) $c$ is irreducible. (3) $c$ is a right factor
of $a$. (4) $cR = \{aR \backslash b\}_r$ by Statement 1.1.

We shall now consider the case where $a$ is irreducible. Since $aR$ is a maximal
right ideal certainly $aR$ is $\cap$ irreducible. By the corollary of (3, p. 26) we have
that $B/aR$ is a division ring where $B$ is the normalizer of $aR$. Hence in this
case $aR = P$, the radical of $aR$ in $B$.


4. Intersection irreducible two-sided ideals. Suppose $a^*R = Ra^*$ is
a two-sided ideal of $R$ which is right $\cap$ irreducible. The normalizer $B$ of
$a^*R$ is $R$. Hence the radical $P$ of $a^*R$ in $R$ is a two-sided ideal of $R$ and $R/P$
is a division ring. Therefore, the unique maximal left ideal $Rc^*$ which contains
$a^*R$ is equal to $P$ and is a two-sided ideal. Thus $Rc^* = c^*R = P$. Certainly
$c^*R$ is a unique maximal right ideal containing $a^*R$. Hence by Theorem 1.1
$a^*R$ is also left $\cap$ irreducible. We have proved


THEOREM 4.1. If $a^*R = Ra^*$ is right $\cap$ irreducible then $a^*R$ is left $\cap$ irreducible.
In addition if $P$ is the radical of $a^*R$ in $R$ then $R/P$ is a division ring and $P =
c^*R = Rc^*$ where $c^*$ is irreducible.


STATEMENT 4.1. If $p^*R = Rp^*$ where $p^*$ is irreducible then $(p^*)^nR =
R(p^*)^n$ is right $\cap$ irreducible for all integers $n$.


Proof. If $n = 1$ this is obvious. For $n > 1$ then $(p^*)^nR \subset p^*R$ and assume
$(p^*)^nR \subseteq Rc$ where $c$ is irreducible and $p^*R \neq Rc$. Then $(p^*)^k \in Rc$ for some
integer $k$. If $k = 1$ then $p^*R = Rc$, a contradiction. If $k > 1$ and $(p^*)^{k-1} \notin
Rc$, then $R = Rc + R(p^*)^{k-1}$ and thus $1 = d_1c + d_2(p^*)^{k-1}$. Multiplying on
the left by $p^*$ we have $p^* = p^*d_1c + p^*d_2(p^*)^{k-1}$. But $p^*d_2 = d_3p^*$ and
$(p^*)^k = d_4c$. Then $p^* = p^*d_1c + d_3d_4c$. Hence $p^* \in Rc$ and $p^*R = Rc$, a
contradiction. Since the assumption that $p^*R \neq Rc$ always leads to a contradiction,
we can only conclude that $p^*R = Rc$. Thus $(p^*)^nR$ is contained in a
unique maximal left ideal and is, therefore, right $\cap$ irreducible.


From Theorem 2.1 of (1) it follows that $a^*R = Ra^*$ is primary in the sense
that $cd \in a^*R$ implies $c^n \in a^*R$ if $d \notin a^*R$ and $d^n \in a^*R$ if $c \notin a^*R$.


THEOREM 4.2. A two-sided ideal $a^*R = Ra^*$ is right $\cap$ irreducible if and only
if $a^* = u(p^*)^n$ for some integer $n$, where $u$ is a unit of $R$, $p^*R = Rp^*$, and $p^*$
is irreducible.


Proof. Suppose $a^*R$ is right $\cap$ irreducible. Then $a^*R \subseteq c^*R$ where $c^*$ is
irreducible and $(c^*)^n \in a^*R$ for some integer $n$. If $a^* = bc^*$ and if $c^* \notin a^*R$
then $b \in c^*R$ and $b = dc^*$, $a^* = d(c^*)^2$. This process continues until
$a^* = k(c^*)^n$ with $(c^*)^n = ea^*$. Then $a^* = kea^*$ and $k$ is a unit. The converse
follows from Statement 4.1.


5. An application. If $R$ is a real closed field then a polynomial $f(x)$ in
$R(\sqrt{-1})[x]$ is $\cap$ irreducible if and only if $f(x) = u(x - r)^n$ where $u$ is a
unit and $r \in R(\sqrt{-1})$. Thus in this case all the roots of $f(x)$ are equal.

Let $Q$ be the ring of quaternions over a real closed field. From the discussion
given in (2, p. 36) a polynomial $f(x)$ in $Q[x]$, where $ax = xa$ for $a \in Q$, is
irreducible if and only if it is linear. In addition $r$ is a right-hand root of $f(x)$
if and only if $f(x) = g(x)(x - r)$.


THEOREM 5.1. A polynomial $f(x) \in Q[x]$, where $Q$ denotes the ring of
quaternions over a real closed field, is right $\cap$ irreducible if and only if all its
right-hand roots are equal.


Proof. Let $r_1$ and $r_2$ be right-hand roots of $f(x)$ which is right $\cap$ irreducible.
Then $f(x) = g_1(x)(x - r_1) = g_2(x)(x - r_2)$. From Theorem 1.1 then
$u(x - r_1) = (x - r_2)$ where $u$ is a unit. Hence $r_1 = r_2$.

Conversely suppose all the right-hand roots of $f(x)$ are equal and let $f(x) =
g_1(x)(ax + b) = g_2(x)(cx + d)$. Then $a^{-1}b = c^{-1}d$. Hence $ca^{-1}(ax + b) =
(cx + d)$ and by Theorem 1.1 we have $f(x)$ is right $\cap$ irreducible.

If the coefficients of $f(x)$ are in the centre of $Q$ and $f(x)$ is right $\cap$ irreducible
we conclude from this section and §4 that $f(x) = u(x - r)^n$ where $u$ is a unit
and $r \in Q$.


REFERENCES 


1. E. H. Feller, The lattice of submodules of a module over a non-commutative ring, Trans. Amer.
Math. Soc., 81 (1956), 342-357.

2. N. Jacobson, The theory of rings, Math. Surveys, 2 (1943).

3. N. Jacobson, Structure of rings, Amer. Math. Soc. Coll. Publ., 37 (1956).

4. D. G. Northcott, Ideal theory, Cambridge Tracts in Mathematics and Mathematical
Physics, 42 (1953).




























ON CONTINUOUS REGULAR RINGS AND 
SEMISIMPLE SELF INJECTIVE RINGS 


YUZO UTUMI 


1. Introduction. Brainerd and Lambek (2, Corollary 4) have proved 
recently that any complete Boolean ring is self-injective. It is easy to see that 
every complete Boolean ring is a continuous regular ring, that is, a regular ring 
of which the lattice of principal left ideals is continuous. This suggests that in a 
continuous regular ring it might be possible to prove the injectivity. However, 
a simple example (Example 3) shows that the conjecture is not true in general. 
Our main theorem is the following. Every continuous regular ring with no 
ideals of index 1 is (both left and right) self-injective (Theorem 3). 

It is known to Wolfson (13, Theorem 5.1) and Zelinsky (15) that the ring 
S of all linear transformations of a vector space of dimension > 2 over a 
division ring is generated by idempotents and also by non-singular elements. 
We shall in the present paper prove this under the assumption that the ring 
is a semisimple one-sided self-injective ring with no ideals of index 1 (Theorem 
2). Since the above S satisfies this assumption (10, (5.1)), our theorem may 
be regarded as a generalization of the result of Wolfson and Zelinsky. 

The author wishes to express his appreciation to J. Lambek for his interest 
and encouragement. 

2. Preliminaries. Throughout this paper the word ideal without modifier 
will always mean two-sided ideal. 

A ring is said to be a semisimple I-ring if every non-zero left ideal contains 
a non-zero idempotent. 

For any left ideal A of a ring S the least upper bound of all integers r such 
that A contains a direct sum of r mutually isomorphic non-zero left ideals 
of S will be denoted by m(A). 

We say that a ring $S$ is of index $m$, if $S$ contains nilpotent elements of
index $m$ (that is, $a^{m-1} \neq 0$, $a^m = 0$) but no elements of higher index. We shall
denote the index of $S$ by $i(S)$.
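
To make the notion of index concrete, the following Python sketch computes the nilpotency index of a single square integer matrix. It is only an illustration: matrices are represented as nested lists, the function names are hypothetical, and the search stops at the matrix size because a nilpotent $n \times n$ matrix over a field has index at most $n$.

```python
def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def nilpotency_index(a):
    """Smallest m with a**m == 0, or None if a is not nilpotent."""
    n = len(a)
    power = a  # holds a**m at the start of iteration m
    for m in range(1, n + 1):
        if all(entry == 0 for row in power for entry in row):
            return m
        power = mat_mul(power, a)
    return None
```

The $3 \times 3$ shift matrix has index 3, so the full ring of $3 \times 3$ matrices over a field contains nilpotent elements of index 3.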

(S1) For any left ideal $A$ of a semisimple I-ring $S$ we have $m(A) =
m(A/N(A)) = i(A/N(A))$ where $N(A)$ denotes the radical of $A$ (11, Theorem
3).

(S2) Every idempotent of a ring $S$ is central if $i(S) = 1$.

(S3) An idempotent $e$ of a ring $S$ with no nilpotent ideals is central if and
only if $eS \subseteq Se$.


Received August 10, 1959. 


A ring is called regular if for any $x$ there is an element $y$ with $xyx = x$.
The lattice of principal left (right) ideals of a regular ring $S$ will be denoted
by $\mathcal{L}(S)$ ($\mathcal{R}(S)$).

(R1) For any two idempotents $e, f$ of a regular ring $S$, if $eSf \neq 0$ then
$0 \neq Se' \subseteq Se$, $Sf' \subseteq Sf$ and $Se' \cong Sf'$ for some $e', f'$.

In fact, let $0 \neq x \in eSf$. The right multiplication by $x$ gives a non-zero
homomorphism $\theta$: $Se \to Sf$. Then $\ker\theta$ (= the kernel of $\theta$) is a principal left
ideal. Let $Se = \ker\theta \oplus Se'$ and $\theta(Se) = Sf'$. It is then evident that $Se' \cong Sf'$.

(R2) Let $S$ be a regular ring and suppose that $\mathcal{L}(S)$ is complete. Then $S$
has a unit.

In fact, $S = \bigcup_{x \in S} Sx$ is generated by an idempotent $e$: $Se = S$. By (S3) $e$ is
central and so $e = 1$.

A ring $S$ is said to be a Boolean ring if $S$ satisfies the identity $x^2 = x$. It is
easily verified that the identity $x^2 = x$ implies the identities $xy = yx$ and
$x + x = 0$. Every Boolean algebra may be regarded as a Boolean ring with
unit, and vice versa (1, p. 154).
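
The two derived identities follow from expanding $(x + x)^2 = x + x$, which gives $x + x = 0$, and $(x + y)^2 = x + y$, which gives $xy + yx = 0$ and hence $xy = yx$. The Python sketch below checks all three identities exhaustively in one small Boolean ring; the model (subsets of a 3-element set encoded as bitmasks, with symmetric difference as addition and intersection as multiplication) is an assumption chosen for illustration.

```python
from itertools import product

N = 3                      # size of the underlying set (illustrative choice)
elements = range(1 << N)   # each subset is encoded as a bitmask


def add(x, y):
    """Ring addition: symmetric difference of subsets."""
    return x ^ y


def mul(x, y):
    """Ring multiplication: intersection of subsets."""
    return x & y


idempotent = all(mul(x, x) == x for x in elements)
commutative = all(mul(x, y) == mul(y, x)
                  for x, y in product(elements, repeat=2))
char_two = all(add(x, x) == 0 for x in elements)
```

The unit of this ring is the full set, encoded as `(1 << N) - 1`, in line with the correspondence between Boolean algebras and Boolean rings with unit.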

(R3) The set of all central idempotents in a ring S with unit forms a Boolean 
algebra (see (6, p. 49)). 

For any given module $A$ and a submodule $B$ we say that $A$ is an essential
extension of $B$ if every non-zero submodule of $A$ has a non-zero intersection
with $B$. Notation: $B \subset' A$ (see 4).

A module $Q$ will be referred to as an injective module if $Q$ is a direct summand
of every extension module.

If an injective module $Q$ is an essential extension of a module $A$ we say that
$Q$ is a minimal injective extension of $A$. Notation: $Q = \hat{A}$.

(Q1) If $A \cong B$, the isomorphism is extended to that of $\hat{A}$ and $\hat{B}$.

(Q2) If $A = A_1 \oplus \dots \oplus A_n$, then $\hat{A} = \hat{A}_1 \oplus \dots \oplus \hat{A}_n$.


A ring S is said to be left (right) self-injective if S has a unit and the left 
(right) S-module S is injective. A ring which is both left and right self-injective 
is called a self-injective ring. 

For any semisimple I-ring $S$ we can construct the maximal left quotient ring
$\bar{S}$ of $S$ (see (7), (10), (14), and also (5)).

(Q3) $\bar{S}$ is a left self-injective regular ring.

(Q4) $\bar{S}$ is an extension ring of $S$. The left $S$-module $\bar{S}$ is an essential extension
of the left $S$-module $S$.

(Q5) If $S$ has unit 1, then 1 is also the unit of $\bar{S}$.

A lattice $L$ is called upper continuous if $L$ is complete and satisfies the
following

Condition (C).
\[
(1)\qquad \Bigl(\bigcup_\alpha a_\alpha\Bigr) \cap b = \bigcup_\alpha (a_\alpha \cap b)
\]
for every chain $\{a_\alpha\}$ and every element $b$.

The following two conditions for continuity also may frequently be seen
in the literature:

Condition (C'). (1) holds for every well-ordered ascending chain $\{a_\alpha\}$
and every $b$ (9, IIIs, p. 3).

Condition (C''). If a subset $\{a_\alpha\}$ of $L$ is directed, that is, for any $a_\alpha$, $a_\beta$
there is $a_\gamma$ with $a_\alpha, a_\beta \le a_\gamma$, then (1) holds for every $b$ (8, Definition 1.14, p. 10).

By virtue of (8, II, p. 237) Conditions (C') and (C'') are equivalent.
Evidently Condition (C) implies Condition (C'), and also Condition (C'')
implies Condition (C). Therefore, these three conditions are all equivalent
for every complete lattice $L$.

(C1) A complete complemented modular lattice $L$ is upper continuous if
and only if $L$ satisfies the following

Condition (M). Let $T$ be a subset of $L$. If $(\bigcup_{a \in F} a) \cap b = 0$ for every finite
subset $F$ of $T$, then $(\bigcup_{a \in T} a) \cap b = 0$ (8, (a), p. 11).

In fact, by Hilfssatz 1.7 and Anmerkung 1.11 of (8, p. 11) Condition (M)
is equivalent to Condition (C'') for any complete relatively complemented
lattice $L$.


DEFINITION. A regular ring $S$ is said to be left (right) continuous if $\mathcal{L}(S)$
($\mathcal{R}(S)$) is upper continuous.

A continuous regular ring (8, Definition 1.1, p. 156) is a ring which is both
left and right continuous. By (R2) every left (right) continuous regular ring
has a unit.

We have proved in the proof of Corollary 1 of (12) the following 


THEOREM 1. Every semisimple left self-injective ring is a left continuous 
regular ring. 


COROLLARY. Every semisimple self-injective ring is a continuous regular 
ring. 


3. Generators of self-injective rings. We shall denote the left (right)
annihilator of a subset $T$ of a ring by $l(T)$ ($r(T)$).


LEMMA 1. Let $S$ be a regular ring and suppose that $\mathcal{L}(S)$ is complete, then
every left annihilator ideal $A$ is principal.


Proof. By (R2) $S$ has a unit 1. Denote by $Sf$ the meet of $S(1 - e)$ for all
idempotents $e \in r(A)$. For any $x \in r(A)$ there is an idempotent $e'$ such that
$xS = e'S$. Then, $e' \in r(A)$ and $Sf \subseteq S(1 - e')$, whence $Sfx \subseteq S(1 - e')e'S = 0$
and $Sf \subseteq l(r(A)) = A$. On the other hand, if $a \in A$, $ae = 0$ for all idempotents
$e \in r(A)$, and so $a \in S(1 - e)$, hence $a \in Sa \subseteq Sf$: this implies that
$A \subseteq Sf$. Therefore, $A = Sf$ is principal, as desired.


LEMMA 2. Let $S$ be a left continuous regular ring. Then, for any left ideal $A$
there is a principal left ideal $Se$ such that (1) $A \subset' Se$ and (2) $Se \subseteq Sf$ whenever
$A \subseteq Sf$. In case $S$ is a semisimple left self-injective ring, $Se$ is a minimal injective
extension of $A$.


Proof. We set $Se = \bigcup_{x \in A} Sx$. Of course, $A \subseteq Se$. If $A \subseteq Sf$, then $Sx \subseteq Sf$
for every $x \in A$, and $Se \subseteq Sf$. Suppose that $A \cap Sg = 0$. For any finite
subset $F$ of $A$ we have $(\bigcup_{x \in F} Sx) \cap Sg = (\sum_{x \in F} Sx) \cap Sg \subseteq A \cap Sg = 0$.
Hence $Se \cap Sg = 0$ by (C1). This implies that $A \subset' Se$.

Next, suppose that $S$ is a semisimple left self-injective ring. Then, by
Theorem 1, $S$ is a left continuous regular ring, and hence $A \subset' Se$ for some
$e \in S$. Since $Se$ is a direct summand of $S$, $Se$ is injective (as a left $S$-module)
(3, p. 8). Therefore, $Se$ is the minimal injective extension of $A$, completing
the proof.


LEMMA 3. Let S be a semisimple left self-injective ring. Suppose that (i)
m(Sx) = 1, (ii) Se does not contain any left ideals isomorphic to Sx, and (iii)
e = e². Then S(1 − e) contains a non-zero central idempotent.


Proof. By Zorn's lemma there exists a maximal isomorphism θ of which
the domain D and the image E are contained in Sx and Se respectively. By
Lemma 2 D and E have minimal injective extensions D̄ and Ē such that D̄ ⊂ Sx,
Ē ⊂ Se and D̄, Ē ∈ ℒ(S). By (Q1) θ is extended to an isomorphism of D̄ and
Ē. In view of the maximality of θ this implies that D = D̄, E = Ē, and hence
also that D, E ∈ ℒ(S). Let E = Se′, Se = Se′ ⊕ Se″ and Sx = D ⊕ Sf, e′, e″
and f being idempotents. Now we shall show that fSe = 0. In fact, if fSe′ ≠ 0,
then by (R1) there exist Sf₁ and Se₁′ such that 0 ≠ Sf₁ ⊂ Sf, Se₁′ ⊂ Se′
and Sf₁ ≅ Se₁′. It follows from this that Sx contains two mutually isomorphic
left ideals θ⁻¹(Se₁′) and Sf₁. Since θ⁻¹(Se₁′) ∩ Sf₁ ⊂ D ∩ Sf = 0, this
contradicts the assumption (i). Hence fSe′ = 0. On the other hand, if fSe″ ≠ 0,
then by (R1) 0 ≠ Sf₂ ⊂ Sf, Se₂″ ⊂ Se″ and Sf₂ ≅ Se₂″ for some f₂, e₂″ ∈ S.
This shows that θ may be extended to an isomorphism of D ⊕ Sf₂ onto
E ⊕ Se₂″, and we obtain a contradiction. Hence fSe″ = 0. Therefore we have
fSe = f(Se′ + Se″) = 0. Now, it is evident from the assumption (ii) that
Sf ≠ 0. Thus 0 ≠ f ∈ l(Se) and 0 ≠ l(Se). By Theorem 1 ℒ(S) is complete,
and hence l(Se) ∈ ℒ(S) by Lemma 1. Since l(Se) is an ideal, l(Se) is generated
by a central idempotent c by virtue of (S3). Then ce ∈ (l(Se))(Se) = 0 and
c ∈ S(1 − e). Therefore S(1 − e) contains a non-zero central idempotent
c, as desired.


LEMMA 4. Let S be a semisimple left self-injective ring and A a principal left
ideal. Then, for any given positive integer n, A has a decomposition A = B ⊕ C
such that (i) B is a direct sum of n mutually isomorphic principal left ideals and
(ii) C is a left ideal with m(C) < n.


Proof. By virtue of Zorn's lemma it is not too hard to see that there exists
a maximal left ideal B of S such that (i) B ⊂ A and (ii) B is a direct sum of n
mutually isomorphic left ideals B_i of S. By Lemma 2 there is a minimal injective
extension B̄ of B such that B̄ ∈ ℒ(S) and B̄ ⊂ A. By (Q2) B̄ = B̄₁ ⊕ ... ⊕ B̄_n,
and B̄_i ≅ B̄_j by (Q1). In view of the maximality of B we have
B = B̄ and B ∈ ℒ(S). Let A = B ⊕ C. If m(C) ≥ n, there is a left ideal B′
such that (i) B′ ⊂ C, (ii) B′ = B₁′ ⊕ ... ⊕ B_n′ for some left ideals B_i′ and
(iii) B_i′ ≅ B_j′ for every i, j. Then B ⊕ B′ = (B₁ ⊕ B₁′) ⊕ ... ⊕ (B_n ⊕ B_n′)
⊂ A and B_i ⊕ B_i′ ≅ B_j ⊕ B_j′. This contradicts the maximality of B.
Therefore m(C) < n, as desired.


LEMMA 5. Let S be a ring and θ an endomorphism of the left S-module S.
Suppose that there are mutually isomorphic left ideals A₁, A₂ such that S = A₁
⊕ ker θ and A₂ is a direct summand of ker θ. Then θ can be represented as a
sum of products of idempotent endomorphisms of the left S-module S.


Proof. Let ker θ = A₂ ⊕ A₃. Denote by ε₁ and ε₂ the projections of
S = A₁ ⊕ ker θ on A₁ and ker θ respectively, and by ω the given isomorphism
of A₁ onto A₂. Then we have the following decompositions of S:
(1) S = A₁′ ⊕ A₂ ⊕ A₃ where A₁′ = {x + ω(x); x ∈ A₁};
(2) S = A₁ ⊕ A₂′ ⊕ A₃ where A₂′ = {ε₁θω⁻¹(y) + y; y ∈ A₂};
(3) S = A₁″ ⊕ ker θ where A₁″ = {x − ε₂θ(x); x ∈ A₁}.

We shall use the following notations:
ε₃ = the projection of S on A₂ with respect to (1);
ε₄ = the projection of S on A₁ with respect to (2);
ε₅ = the projection of S on ker θ with respect to (3).

Let x ∈ A₁. Since x = (x + ω(x)) − ω(x), ε₃(x) = −ω(x). From
x = (x − ε₂θ(x)) + ε₂θ(x) we have ε₅(x) = ε₂θ(x). Next, let y ∈ A₂. Then,
since y = −ε₁θω⁻¹(y) + (ε₁θω⁻¹(y) + y), we see that ε₄(y) = −ε₁θω⁻¹(y).
Thus, for any x ∈ A₁, ε₄ε₃ε₁(x) = ε₄ε₃(x) = ε₄(−ω(x)) = −ε₁θω⁻¹(−ω(x)) =
ε₁θ(x) and ε₅ε₁(x) = ε₅(x) = ε₂θ(x), whence (ε₄ε₃ε₁ + ε₅ε₁)(x) = (ε₁ + ε₂)θ(x)
= θ(x). Evidently (ε₄ε₃ε₁ + ε₅ε₁)(ker θ) = 0 = θ(ker θ). Therefore we obtain
θ = ε₄ε₃ε₁ + ε₅ε₁, as desired.
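
The computation in the proof of Lemma 5 uses only the module decomposition, so it can be checked numerically in a toy setting. The sketch below is an illustration only: it replaces the left S-module S by the rational vector space Q³ with A₁, A₂, A₃ spanned by the standard basis vectors, and the coefficients a, b, c are arbitrary assumed values. It builds the five projections from the decompositions (1), (2), (3) and compares ε₄ε₃ε₁ + ε₅ε₁ with θ.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inverse(M):
    """Gauss-Jordan inverse over the rationals (M is assumed invertible)."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [v / lead for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def projection(basis, keep):
    """Projection onto span{basis[i] : i in keep} along the other basis vectors."""
    n = len(basis)
    B = [[Fraction(basis[j][i]) for j in range(n)] for i in range(n)]  # columns = basis
    D = [[Fraction(int(i == j and i in keep)) for j in range(n)] for i in range(n)]
    return matmul(matmul(B, D), inverse(B))

def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

# Assumed toy data: A1 = <e1>, A2 = <e2>, A3 = <e3>, omega(e1) = e2,
# theta(e1) = a*e1 + b*e2 + c*e3 and theta(e2) = theta(e3) = 0.
a, b, c = 2, 3, 5
e1, e2, e3 = [1, 0, 0], [0, 1, 0], [0, 0, 1]
theta = [[a, 0, 0], [b, 0, 0], [c, 0, 0]]   # columns are images of e1, e2, e3

eps1 = projection([e1, e2, e3], {0})              # on A1 along ker(theta)
eps3 = projection([[1, 1, 0], e2, e3], {1})       # on A2 w.r.t. decomposition (1)
eps4 = projection([e1, [a, 1, 0], e3], {0})       # on A1 w.r.t. decomposition (2)
eps5 = projection([[1, -b, -c], e2, e3], {1, 2})  # on ker(theta) w.r.t. (3)

total = add(matmul(matmul(eps4, eps3), eps1), matmul(eps5, eps1))
idempotent = all(matmul(E, E) == E for E in (eps1, eps3, eps4, eps5))
print(total == theta, idempotent)
```

Exact rational arithmetic is used so that the equalities are tested exactly rather than up to rounding.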


LEMMA 6. Under the assumption of Lemma 5, if θ itself is an idempotent
endomorphism of the left S-module S, then θ is a sum of two non-singular
endomorphisms of the module.


Proof. Assume that A_i and ω have still the same meaning as in the proof
of Lemma 5, and also that A₁ = θ(S) without loss in generality. We consider
the following mappings:

σ(x + y + z) = (x + ω⁻¹(y)) − ω(x) + z,
σ′(x + y + z) = −ω⁻¹(y) + (ω(x) + y) + z,
ρ(x + y + z) = −ω⁻¹(y) + ω(x) − z,
ρ′(x + y + z) = ω⁻¹(y) − ω(x) − z

for x ∈ A₁, y ∈ A₂ and z ∈ A₃. Evidently these mappings are endomorphisms
of the module S. It is easy to verify that σσ′ = σ′σ = 1 and ρρ′ = ρ′ρ = 1.
Since (σ + ρ)(x + y + z) = x = θ(x + y + z), we have θ = σ + ρ, as
desired.
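
The four mappings in the proof of Lemma 6 can be written out as matrices in the same kind of toy model. This is a minimal sketch, assuming the module is Q³ with A₁, A₂, A₃ spanned by e₁, e₂, e₃ and ω(e₁) = e₂; the variable names are illustrative. Columns hold the images of the basis vectors.

```python
def matmul(A, B):
    """Product of two square integer matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
theta = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]   # projection on A1 along A2 + A3

# sigma(e1) = e1 - e2, sigma(e2) = e1, sigma(e3) = e3, and so on.
sigma = [[1, 1, 0], [-1, 0, 0], [0, 0, 1]]
sigma_prime = [[0, -1, 0], [1, 1, 0], [0, 0, 1]]
rho = [[0, -1, 0], [1, 0, 0], [0, 0, -1]]
rho_prime = [[0, 1, 0], [-1, 0, 0], [0, 0, -1]]

sigma_plus_rho = [[s + r for s, r in zip(row_s, row_r)]
                  for row_s, row_r in zip(sigma, rho)]

print(matmul(sigma, sigma_prime) == identity,
      matmul(rho, rho_prime) == identity,
      sigma_plus_rho == theta)
```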


LEMMA 7. Under the assumption of Lemma 5, θ can be represented as a
sum of non-singular endomorphisms of the left S-module S.


Proof. From Lemma 5 it follows that θ = ε₄ε₃ε₁ + ε₅ε₁. Now we note that
each of the idempotents ε₁, ε₃, ε₄, and 1 − ε₅ satisfies the assumption for θ in
Lemma 5. Therefore, by Lemma 6, these idempotents are represented as sums of
non-singular endomorphisms, and so is ε₅ = 1 − (1 − ε₅). This implies that θ
also can be represented as a sum of non-singular endomorphisms, completing
the proof.


THEOREM 2. Let S be a semisimple left self-injective ring with no ideals of
index 1. Then S is generated by idempotents, and is also generated by
non-singular elements.


Proof. Let x ∈ S. Then l(x) ∈ ℒ(S) by Lemma 1, and hence l(x) = Se₁
and S = Se₁ ⊕ Se₁′ for some e₁, e₁′ ∈ S. Applying Lemma 4 we obtain a
decomposition Se₁′ = Se₂ ⊕ Se₃ ⊕ Se₄ such that Se₂ ≅ Se₃ and m(Se₄) < 2.
Thus S = Se₁ ⊕ Se₂ ⊕ Se₃ ⊕ Se₄. With no loss of generality we assume that
e₁, e₂, e₃, and e₄ are orthogonal idempotents and 1 = e₁ + e₂ + e₃ + e₄.
Evidently l(e₂x) ⊃ Se₁ ⊕ Se₃ ⊕ Se₄. However, if y ∈ l(e₂x), then ye₂x = 0
and ye₂ ∈ l(x) = Se₁, hence ye₂ = ye₂e₁ = 0, whence

y = y(e₁ + e₂ + e₃ + e₄) = ye₁ + ye₃ + ye₄ ∈ Se₁ + Se₃ + Se₄.

Thus,

l(e₂x) = Se₁ ⊕ Se₃ ⊕ Se₄.

Similarly,

l(e₃x) = Se₁ ⊕ Se₂ ⊕ Se₄,

and

l(e₄x) = Se₁ ⊕ Se₂ ⊕ Se₃.

Denote by T the subring of S which is generated by all idempotents
(non-singular elements) of S. Since l(e₂x) contains a direct summand Se₃
isomorphic to Se₂, Lemma 5 (Lemma 7) assures that e₂x ∈ T. Also, we have
e₃x ∈ T in an analogous way. Next, we shall show that

Se₁ + Se₂ + Se₃ (= S(e₁ + e₂ + e₃))

contains a left ideal isomorphic to Se₄. In fact, if not, e₄ ≠ 0 and m(Se₄) = 1.
Then, by virtue of Lemma 3, S(1 − (e₁ + e₂ + e₃)) (= Se₄) contains a non-zero
central idempotent c. Of course, m(Sc) = 1 since 0 ≠ Sc ⊂ Se₄. Hence
by (S1) Sc is an ideal of index 1, which contradicts our assumption. This
implies that Se₁ + Se₂ + Se₃ (= l(e₄x)) contains a left ideal A isomorphic to
Se₄. Since A ∈ ℒ(S) it follows from Lemma 5 (Lemma 7) that e₄x ∈ T.
Therefore,

x = e₁x + e₂x + e₃x + e₄x ∈ T

and S = T, which completes the proof.


The following examples will illustrate that Theorem 2 does not hold for 
semisimple self-injective rings of index 1. 


Example 1. Every division ring D is, of course, semisimple self-injective.
However, if D ≠ GF(p) for every prime p, D is not generated by idempotents.


Example 2. Let M be a set containing at least two elements. Then it is
well known that the set of all subsets of M forms a complete Boolean algebra
S. By virtue of (2, Corollary 4) the Boolean ring S is self-injective. However,
if x is non-singular, then x = x(xx⁻¹) = x²x⁻¹ = xx⁻¹ = 1. Since 1 + 1 = 0,
the subring T generated by all non-singular elements consists of 1 and 0.
Thus S ≠ T.
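
Example 2 can be checked by brute force in the smallest case. The sketch below is an illustration under the assumption M = {0, 1}: it models the Boolean ring of subsets of M with symmetric difference as addition and intersection as multiplication, lists the non-singular elements, and closes them under the ring operations.

```python
from itertools import combinations

M = frozenset({0, 1})
S = [frozenset(c) for r in range(len(M) + 1) for c in combinations(sorted(M), r)]

def add(x, y):
    """Ring addition in the Boolean ring: symmetric difference."""
    return x ^ y

def mul(x, y):
    """Ring multiplication in the Boolean ring: intersection."""
    return x & y

one, zero = M, frozenset()

# An element is non-singular if it has a multiplicative inverse.
units = [x for x in S if any(mul(x, y) == one for y in S)]

# Close the set of units under + and * to obtain the subring T they generate
# (in characteristic 2, subtraction coincides with addition).
T = set(units)
changed = True
while changed:
    changed = False
    for x in list(T):
        for y in list(T):
            for z in (add(x, y), mul(x, y)):
                if z not in T:
                    T.add(z)
                    changed = True

print(len(S), len(units), len(T))
```

Here S has four elements, while T contains only the empty set and M itself.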


4. Injectivity of continuous regular rings. 


LEMMA 8. Let S be a left continuous regular ring, and S̄ the maximal left
quotient ring of S. Then every idempotent of S̄ is contained in S.


Proof. Let e be an idempotent of S̄, and let A be the set of all idempotents
in S̄e ∩ S. By (Q3), S̄ is a regular ring with unit 1 and ℒ(S̄) is complete.
Hence there exists the join ∪_{f∈A} S̄f. Clearly S̄e ⊃ ∪_{f∈A} S̄f, and
S̄e = (∪_{f∈A} S̄f) ⊕ S̄g for some g ∈ S̄. If S̄g ≠ 0, then S̄g ∩ S ≠ 0 by (Q4),
whence S̄g ∩ S contains an idempotent f′ ≠ 0. Since f′ ∈ S̄g ∩ S ⊂ S̄e ∩ S,
we have f′ ∈ A, and so f′ ∈ (∪_{f∈A} S̄f) ∩ S̄g = 0, a contradiction. Thus
S̄g = 0 and S̄e = ∪_{f∈A} S̄f. On the other hand, by Lemma 2 there is an
idempotent e′ ∈ S such that Σ_{f∈A} Sf ⊂′ Se′. Evidently S̄e′ ⊃ ∪_{f∈A} S̄f = S̄e.
Let S̄e′ = S̄e ⊕ S̄h. Then

Se′ = S̄e′ ∩ S ⊃ (S̄e ∩ S) ⊕ (S̄h ∩ S) ⊃ (Σ_{f∈A} Sf) ⊕ (S̄h ∩ S).

Since Σ_{f∈A} Sf ⊂′ Se′, it follows that S̄h ∩ S = 0. Hence S̄h = 0 by (Q4), and
we see that S̄e = S̄e′. This shows that every principal left ideal of S̄ is generated
by an idempotent in S. In particular, S̄(1 − e) = S̄e″ for some idempotent
e″ ∈ S. Let Se′ ⊕ Se″ = Su, u being an idempotent. By (Q5) the unit 1 of S̄
is contained in S, and so S̄S = S̄. Hence we have

S̄u = S̄e′ ⊕ S̄e″ = S̄e ⊕ S̄(1 − e) = S̄.

This implies by (S3) that u is central and hence that u = 1. Thus Se′ ⊕ Se″ =
S. Let x + y = 1, x ∈ Se′, y ∈ Se″. Since x ∈ Se′ ⊂ S̄e′ = S̄e and y ∈ Se″ ⊂
S̄e″ = S̄(1 − e), we know that x = xe and ye = 0. Hence e = (x + y)e =
xe = x ∈ Se′ ⊂ S. Therefore every idempotent e of S̄ belongs to S, as desired.











THEOREM 3. Let S be a left continuous regular ring with no ideals of index 1.
Then S is left self-injective.


Proof. Denote the maximal left quotient ring of S by S̄. If S̄ has an ideal
A of index 1, then A ∩ S ≠ 0 by (Q4). Hence i(A ∩ S) = 1 and we have a
contradiction to the assumption. Thus S̄ has no ideals of index 1. Since S̄
is semisimple left self-injective by (Q3) it follows from Theorem 2 that S̄ is
generated by idempotents. Now, Lemma 8 assures that every idempotent of
S̄ is contained in S. Therefore S̄ = S and S also is left self-injective.


COROLLARY. Every continuous regular ring with no ideals of index 1 is 
self-injective. 


The following example will illustrate the existence of continuous regular 
rings which are not left (right) self-injective. 


Example 3. Let {D_α} be an infinite family of division rings D_α, and let
F_α be a proper division subring of D_α for each α. We denote by S the subring
of the complete direct sum of D_α consisting of all elements x such that all but
a finite number of α-components of x belong to F_α. Then S is a continuous
regular ring. However, the minimal left self-injective (right self-injective,
self-injective) extension ring of S is the complete direct sum of D_α.

5. Ideals of index 1. In connection with the assumption of Theorems 2
and 3 it may be of interest to see that every left continuous regular ring S
has the decomposition S = S₁ ⊕ S₁′ such that S₁ is, if non-zero, an ideal of
index 1 and S₁′ is an ideal not containing any ideals of index 1. This is an
immediate consequence of the following


THEOREM 4. Let S be a semisimple I-ring. Then S has a maximal ideal S_n
of index ≤ n, and l(r(S_n)) = S_n.


Proof. Let S_n be the sum of all ideals of index ≤ n, and let a be a nilpotent
element of l(r(S_n)). If the index of a is m, l(r(S_n)) contains a system {e_ij} of
total matrix units of degree m by virtue of (6, Theorem 1, p. 237). Assume
that Se₁₁ ∩ A = 0 for every ideal A of index ≤ n. Then ASe₁₁ ⊂ Se₁₁ ∩ A = 0,
and ASe₁₁ = 0, whence S_nSe₁₁ = 0 and Se₁₁ ⊂ r(S_n). Thus (Se₁₁)² ⊂
l(r(S_n))r(S_n) = 0 and so e₁₁ = 0, a contradiction. Therefore Se₁₁ ∩ A ≠ 0 for
some ideal A of index ≤ n. By assumption Se₁₁ ∩ A contains a nonzero idempotent
f₁₁′. Set f_ij = e_i1 f₁₁′ e_1j. Then it is easy to verify that {f_ij} forms a system
of total matrix units of degree m. Hence Σ f_{i,i+1} is a nilpotent element of index
m. Since f₁₁′ ∈ A, we have f_ij ∈ A and Σ f_{i,i+1} ∈ A. Thus m ≤ n and so
i(l(r(S_n))) ≤ n. Therefore l(r(S_n)) = S_n and i(S_n) ≤ n.
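
The step from a system of matrix units of degree m to a nilpotent element of index m can be illustrated with ordinary matrices. This is a minimal sketch, assuming the matrix units are the standard ones in the ring of m × m integer matrices with m = 4; the helper names are illustrative.

```python
def matmul(A, B):
    """Product of two square integer matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def unit(i, j, m):
    """Standard matrix unit with a single 1 in row i, column j (0-based)."""
    return [[int(r == i and c == j) for c in range(m)] for r in range(m)]

def nilpotency_index(N):
    """Smallest k with N**k == 0 (N is assumed to be nilpotent)."""
    power, k = N, 1
    while any(any(row) for row in power):
        power = matmul(power, N)
        k += 1
    return k

m = 4
zero = [[0] * m for _ in range(m)]

# Sum of the units e_{i,i+1}: the superdiagonal shift matrix.
shift = [[sum(unit(i, i + 1, m)[r][c] for i in range(m - 1)) for c in range(m)]
         for r in range(m)]

# Defining relations of matrix units: e_ij e_kl = e_il if j == k, else 0.
relations_hold = all(
    matmul(unit(i, j, m), unit(k, l, m)) == (unit(i, l, m) if j == k else zero)
    for i in range(m) for j in range(m) for k in range(m) for l in range(m)
)

print(relations_hold, nilpotency_index(shift))
```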


COROLLARY. Let S be a regular ring and suppose that ℒ(S) is complete. Then,
for every positive integer n there is the decomposition S = S_n ⊕ S_n′
such that S_n is an ideal of index ≤ n and S_n′ is an ideal not containing any ideals
of index ≤ n.











Proof. By Theorem 4 the maximal ideal S_n of index ≤ n is a left annihilator
ideal. Hence, by Lemma 1, S_n is a principal left ideal. Moreover, since S_n
is an ideal, (S3) assures that S_n is generated by a central idempotent. Thus
S = S_n ⊕ S_n′ for some ideal S_n′. If S_n′ contains an ideal A of index ≤ n, then
A is also an ideal of S and hence is contained in S_n. This implies that
A ⊂ S_n ∩ S_n′ = 0, completing the proof.


THEOREM 5. Let S be a regular ring with unit, and let i(S) = 1. Then the
following properties are equivalent:

(i) S is continuous.
(ii) The Boolean algebra B of idempotents of S is complete.
(iii) The Boolean ring B of idempotents of S is self-injective.


Proof. Every idempotent of S is central by (S2). Hence the set B of all
idempotents of S forms a Boolean algebra by (R3). Moreover, every principal
(left) ideal A is generated by a central idempotent e_A. Thus it is easy to see
that the correspondence A → e_A gives an isomorphism of ℒ(S) (= ℛ(S)) and
B. If S is continuous, then ℒ(S) is complete and so is B. Conversely, if B is
a complete Boolean algebra, B satisfies the infinite distributivity (1, p. 165),
and hence B is upper continuous. Therefore ℒ(S) (= ℛ(S)) is upper
continuous, and S is continuous.

(ii) ⇒ (iii) follows directly from (2, Corollary 4). If we assume (iii), then
B is the maximal quotient ring of B itself by (Q4). Hence B is complete by
(2, Theorem 5), as desired.


REFERENCES 

1. G. Birkhoff, Lattice theory, Amer. Math. Soc. Colloq. Publ., 25, rev. ed. (1948). 

2. B. Brainerd and J. Lambek, On the ring of quotients of a Boolean ring, Can. Math. Bull., 
2 (1959), 25-29 

3. H. Cartan and S. Eilenberg, Homological algebra (Princeton, 1956). 

4. B. Eckmann and A. Schopf, Ueber injektive Moduln, Archiv der Mathematik, 4 (1956), 75-78. 

5. G. D. Findlay and J. Lambek, A generalized ring of quotients, Can. Math. Bull., 1 (1958),
77-85, 155-167.

6. N. Jacobson, Structure of rings, Amer. Math. Soc. Colloq. Publ., 37 (1956). 

7. R. E. Johnson, The extended centralizer of a ring over a module, Proc. Amer. Math. Soc., 2 
(1951), 891-895. 

8. F. Maeda, Kontinuierliche Geometrien (Springer, 1958). 

9. J. von Neumann, Continuous geometry, Part I (Princeton, 1936).

10. Y. Utumi, On quotient rings, Osaka Math. J., 8 (1956), 1-16.

11. Y. Utumi, A note on an inequality of Levitzki, Proc. Japan Acad., 33 (1957), 249-251.

12. Y. Utumi, On a theorem on modular lattices, Proc. Japan Acad., 35 (1959), 16-21.

13. K. G. Wolfson, An ideal theoretic characterization of the ring of all linear transformations, 
Amer. J. Math., 75 (1957), 358-386. 

14. E. T. Wong, On injective rings, Bull. Amer. Math. Soc., 63 (1957), 104. 

15. D. Zelinsky, Every linear transformation is a sum of non-singular ones, Proc. Amer. Math. 
Soc., 6 (1954), 627-630. 


Osaka Women's University and McGill University











AUTOMORPHISMS OF FINITE LINEAR GROUPS 
ROBERT STEINBERG 


1. Introduction. By the methods used heretofore for the determination 
of the automorphisms of certain families of linear groups, for example, the 
(projective) unimodular, orthogonal, symplectic, and unitary groups (7, 8), 
it has been necessary to consider the various families separately and to give
many case-by-case discussions, especially when the underlying vector space 
has few elements, even though the final results are very much the same for 
all of the groups. The purpose of this article is to give a completely uniform 
treatment of this problem for all the known finite simple linear groups (listed 
in §2 below). Besides the “classical groups” mentioned above, these include
the “exceptional groups,” considered over the complex field by Cartan and
over an arbitrary field by Dickson, Chevalley, Hertzig, and the author (3, 4, 
5, 6, 10, 15). The automorphisms of the latter groups are given here for the 
first time. The unifying principles come from the theory of Lie algebras: each 
group is a group of automorphisms of a corresponding Lie algebra and this 
leads to structural properties shared by all of the groups. These centre around 
the so-called Bruhat decomposition (see (2) and 4.8 below), which, in case 
the underlying field is complex, reduces to the decomposition of the group into 
double cosets relative to a maximal solvable connected subgroup (see also (13) 
where much use is made of this decomposition). Stated roughly, the final 
result is that the outer automorphisms of these groups are generated by field 
automorphisms, graph automorphisms, which come from symmetries of the 
Schlaefli (or Coxeter) graph of the root structure of the corresponding Lie 
algebra, and diagonal automorphisms, a prototype of which is an automorphism 
of the unimodular group produced by conjugation by a (diagonal) matrix of 
determinant other than 1. Exact statements of these results (3.2 to 3.6 below) 
follow a description of the groups and automorphisms to be considered. 

An introduction to the standard Lie algebra terminology together with 
statements of the principal results in the classification of the simple Lie 
algebras over the complex field can be found in (4, pp. 15-19). (Proofs are
available in (3: thesis), (9), (14), or (16).)


Received July 7, 1959.


2. The groups. Let us start with a Cartan decomposition of a simple Lie
algebra over the complex field and denote by Π and Σ respectively the sets
of positive and fundamental roots relative to a fixed ordering of the additive
group generated by the roots. Then, as in (4), one can replace the complex
field by an arbitrary base field K after choosing a generating set {X_r, X_{−r},
H_r, r ∈ Π} to fulfil the conditions of Theorem 1 of (4), and then define:
x_r(k) = exp(ad kX_r); 𝔛_r = {x_r(k), k ∈ K}; 𝔘 (𝔙) is the group generated
by those 𝔛_r for which r is positive (negative); and finally G (denoted G′ in
(4)) is the group generated by 𝔘 and 𝔙. The various groups G obtained in
this way are A_l (l ≥ 1), B_l (l ≥ 2), C_l (l ≥ 3), and D_l (l ≥ 4), which are
identified in (11) as suitable (projective) unimodular, orthogonal, symplectic,
and orthogonal groups acting on spaces of l + 1, 2l + 1, 2l and 2l dimensions
respectively, as well as the exceptional groups E₆, E₇, E₈, F₄ and G₂. The
groups G of this paragraph are called normal types.

If the additive group generated by the roots admits an automorphism
r → r̄ of order 2 such that Σ̄ = Σ and if the field K admits an automorphism
k → k̄ of order 2, one can define an automorphism σ of the normal type of
group such that x_a(k)^σ = x_ā(k̄) for all a ∈ Σ or −Σ, k ∈ K, then restrict
each of 𝔘 and 𝔙 to the subgroup of elements invariant under σ, and finally
restrict G to the group generated by these restrictions (15). In this way one
gets subgroups of A_l (l ≥ 2), D_l, and E₆ which we denote A_l¹ (unitary group
in l + 1 dimensions), D_l¹ (a second orthogonal group in 2l dimensions), and
E₆¹ respectively. Similarly, automorphisms of order 3 yield a second
subgroup D₄² of D₄. The groups of this paragraph are called twisted types and are
also denoted generically by G.


3. The automorphisms. Since each of the groups G above is centreless 
(to be proved in 4.4; actually with 5 exceptions the groups are all simple 
(4, 15)), we can identify G with its group of inner automorphisms. 

For each normal type let ℌ̂ (this is essentially ℌ in (4)) denote the group
of homomorphisms r → h(r) of the additive group generated by the roots into
K*, the multiplicative group of K, with multiplication in ℌ̂ defined by
(h₁h₂)(r) = h₁(r)h₂(r), and let ℌ (this is ℌ′ in (4)) denote the subgroup
consisting of those homomorphisms which can be extended to the group of
weights. Each h ∈ ℌ̂ leads to an automorphism of the Lie algebra and then
to one of G (also denoted h) such that:


3.1. x_r(k)^h = x_r(h(r)k) (r ∈ Π or −Π, k ∈ K).

If G is a twisted type, then ℌ̂ is to be restricted to those elements which are
self-conjugate in the sense that h(r̄) is the conjugate of h(r) under k → k̄,
and ℌ to those which have self-conjugate extensions to the group of weights.
The elements of ℌ̂ considered as acting on G are called diagonal automorphisms.

Each group G as a linear group admits field automorphisms induced by
automorphisms of K (which must be restricted to commute with k → k̄ in
the twisted cases).

Finally, symmetries of the corresponding graph lead to automorphisms of G.
If r → r̄ is an automorphism of the group generated by the roots such that
Σ̄ = Σ, there exists an automorphism σ of G such that x_a(k)^σ = x_ā(k), a ∈ Σ
or −Σ, k ∈ K (see (14, pp. 11-104) or (16, p. 94)). This yields extra
automorphisms of A_l (l ≥ 2), D_l and E₆ (5 extra for D₄ and 1 extra for each other
group). Also if K is perfect and of characteristic 3 and if G is of type G₂ with
fundamental roots a and b such that 2a + 3b is also a root, there is an
automorphism σ of G such that x_a(k)^σ = x_b(k), x_b(k)^σ = x_a(k³), k ∈ K, with
similar equations for −a and −b. If K is perfect and of characteristic 2
and if G is of type B₂ or F₄, a similar automorphism exists (13, Exposés 21 to
24). The automorphisms of this paragraph as well as the identity are called
graph automorphisms. Note that distinct graph automorphisms effect distinct
permutations of the groups 𝔛_a, a ∈ Σ.

Our aim is to prove first:


3.2. If G is one of the groups defined in §2 and if G is finite, each
automorphism σ of G can be written σ = gfdi, with i, d, f, and g being inner, diagonal,
field and graph automorphisms respectively. In this representation f and g are
uniquely determined by σ.


Then denoting by Ĝ (this is G in (4)) the group of automorphisms of G
generated by ℌ̂ and G, by Â the group generated by Ĝ and the group of field
automorphisms F, and by A the group of all automorphisms of G, and assuming
that K has q³, q², or q elements in the respective cases that G is of type D₄²,
one of the other twisted types, or a normal type, we show:


3.3. G ⊂ Ĝ ⊂ Â ⊂ A is a normal sequence for A.


3.4. Ĝ/G is isomorphic to ℌ̂/ℌ, hence is Abelian. Thus Ĝ = G for the groups
E₈, F₄, G₂ and D₄²; Ĝ/G has order (l + 1, q − 1), (2, q − 1), (2, q − 1),
(4, q^l − 1), (3, q − 1), (2, q − 1), (l + 1, q + 1), (4, q^l + 1), or (3, q + 1)
for the respective group A_l, B_l, C_l, D_l, E₆, E₇, A_l¹, D_l¹ or E₆¹; Ĝ/G is cyclic
with the sole exception: G of type D_l (l even) and q odd.
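
The orders listed in 3.4 are greatest common divisors, so they are easy to tabulate. The following is a minimal sketch: the function name and the string labels for the families are ad hoc choices made for this illustration, and q is the parameter of the text (so that K has q, q² or q³ elements according to the type).

```python
from math import gcd

def diagonal_index(family, l, q):
    """Order of the quotient of 3.4 for the given type; labels are illustrative."""
    table = {
        "A": lambda: gcd(l + 1, q - 1),
        "B": lambda: gcd(2, q - 1),
        "C": lambda: gcd(2, q - 1),
        "D": lambda: gcd(4, q**l - 1),
        "E6": lambda: gcd(3, q - 1),
        "E7": lambda: gcd(2, q - 1),
        "E8": lambda: 1,
        "F4": lambda: 1,
        "G2": lambda: 1,
        "A^1": lambda: gcd(l + 1, q + 1),   # twisted type of A_l
        "D^1": lambda: gcd(4, q**l + 1),    # twisted type of D_l
        "E6^1": lambda: gcd(3, q + 1),      # twisted type of E_6
        "D4^2": lambda: 1,                  # twisted type of D_4 of order 3
    }
    return table[family]()

for family, l, q in [("A", 1, 3), ("D", 4, 3), ("D", 5, 3), ("A^1", 2, 2), ("E6", 6, 4)]:
    print(family, l, q, diagonal_index(family, l, q))
```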


3.5. Â/Ĝ is isomorphic to F, hence is cyclic if K is finite.


3.6. The graph automorphisms form a system of coset representatives of A
over Â. Thus A = Â with the exceptions: A/Â has order 2 if G is A_l (l ≥ 2),
D_l (l ≥ 5) or E₆, or if G is B₂ or F₄ and K has characteristic 2, or if G is G₂ and
K has characteristic 3; A/Â is isomorphic to the symmetric group on 3 objects
if G is D₄.


An immediate consequence of 3.3 to 3.6 is that each of the above groups
which is simple verifies the Schreier conjecture (12, p. 303): if A is the
automorphism group of a finite simple non-Abelian group G, then A/G is solvable.
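
Statements 3.3 to 3.6 determine the order of A/G as a product of three factors: the diagonal part from 3.4, the field part from 3.5, and the number of graph automorphisms from 3.6. The sketch below covers the normal types only, with K = GF(q); the function names and the (family, l) labelling are assumptions made for the illustration.

```python
from math import gcd

def prime_power(q):
    """Return (p, f) with q == p**f and p prime; raise ValueError otherwise."""
    for p in range(2, q + 1):
        if q % p == 0:          # the smallest divisor > 1 is automatically prime
            f, rest = 0, q
            while rest % p == 0:
                rest //= p
                f += 1
            if rest != 1:
                raise ValueError("q is not a prime power")
            return p, f
    raise ValueError("q must be at least 2")

def outer_order(family, l, q):
    """Order of A/G for a normal type over GF(q), read off from 3.4 to 3.6."""
    p, f = prime_power(q)       # the field automorphisms form a cyclic group of order f
    if family == "A":
        d = gcd(l + 1, q - 1)
    elif family in ("B", "C"):
        d = gcd(2, q - 1)
    elif family == "D":
        d = gcd(4, q**l - 1)
    elif family == "E":
        d = {6: gcd(3, q - 1), 7: gcd(2, q - 1), 8: 1}[l]
    else:                       # F4 and G2
        d = 1
    if family == "D" and l == 4:
        g = 6
    elif (family == "A" and l >= 2) or (family == "D" and l >= 5) or (family, l) == ("E", 6):
        g = 2
    elif ((family, l) in (("B", 2), ("F", 4)) and p == 2) or ((family, l) == ("G", 2) and p == 3):
        g = 2
    else:
        g = 1
    return d * f * g

for family, l, q in [("A", 1, 9), ("A", 2, 4), ("D", 4, 3), ("G", 2, 3), ("B", 2, 4)]:
    print(family, l, q, outer_order(family, l, q))
```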

Before starting the proofs of the above statements, we shall examine the 
groups under consideration a bit more closely. 


4. Structure of the groups. In this section G need not be finite. How- 
ever, until the last paragraph it is assumed that G is a normal type. Using the 
notation of §§2 and 3, one has: 








4.1. Each x ∈ 𝔘 can be written uniquely x = ∏ x_r, x_r ∈ 𝔛_r, the product being
over the positive roots in increasing order.


The proof of 4.1 as well as 4.2, 4.3, 4.5, 4.6, 4.7, 4.8, and 4.9 below can be 
found in (4). 


4.2. ℌ is a subgroup of G, 𝔘ℌ is the normalizer of 𝔘 and 𝔘ℌ ∩ 𝔙 = 1.

4.3. If K is finite and has characteristic p, then 𝔘 is a p-Sylow subgroup of G.

4.4. The centre of G is 1.


Proof. If x is in the centre of G, then x ∈ 𝔘ℌ by 4.2. Similarly x ∈ 𝔙ℌ,
whence x ∈ ℌ = 𝔘ℌ ∩ 𝔙ℌ by 4.2. But then 3.1 with x = h yields h(r) = 1
for each r ∈ Π, whence x = h = 1.

Let W denote the Weyl group and w_r the reflection in W corresponding to
the root r. One has:

4.5. For each w ∈ W there is ω(w) ∈ G such that ω(w)x_r(k)ω(w)⁻¹ =
x_{wr}(ηk), r ∈ Π or −Π, with η = ±1 depending on w and r but not on k.

4.6. The union of the sets ℌω(w) is a group 𝔚 and the map w → ℌω(w)
is an isomorphism of W on 𝔚/ℌ.

Next for each w ∈ W define 𝔘_w = 𝔘 ∩ ω(w)⁻¹𝔙ω(w), 𝔘_w′ = 𝔘 ∩ ω(w)⁻¹𝔘ω(w)
so that 𝔘_w (𝔘_w′) is the group generated by those 𝔛_r for which r > 0
and wr < 0 (wr > 0). Thus if a ∈ Σ and w = w_a one has 𝔘_w = 𝔛_a.

4.7. 𝔘 = 𝔘_w𝔘_w′ = 𝔘_w′𝔘_w.

4.8. The sets 𝔘ℌω(w)𝔘_w, w ∈ W, are the distinct double cosets of G relative
to 𝔘ℌ, and each element of G has a unique expression of the indicated form.

Analogous results hold with 𝔘 replaced by 𝔙.


4.9. For each r ∈ Π there is a homomorphism φ_r of SL₂(K), the unimodular
group, onto G_r, the group generated by 𝔛_r and 𝔛_{−r}, such that

φ_r([1 k; 0 1]) = x_r(k),
φ_r([1 0; k 1]) = x_{−r}(k),
φ_r([0 1; −1 0]) ≡ ω(w_r) mod ℌ,

where [a b; c d] denotes the matrix with rows (a, b) and (c, d),
and φ_r⁻¹(ℌ ∩ G_r) consists of the diagonal matrices. The kernel of φ_r is contained
in the centre of SL₂(K).

We may (and do) normalize so that

φ_a([0 1; −1 0]) = ω(w_a)

for each a ∈ Σ and then define each ω(w) to be a product of elements ω(w_a),
a ∈ Σ. Then 4.9 implies:


4.10. For each a ∈ Σ the equation x_a(1)x_{−a}(k)x_a(1) = x_{−a}(k)x_a(1)x_{−a}(k)
holds only if k = −1 and then both sides are equal to ω(w_a). Given k, l ∈ K*,
a ∈ Σ, then x_a(k)x_{−a}(l)x_a(m)x_{−a}(t) is in ℌω(w_a) only if m = −l⁻¹ and t =
l + kl².
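
Both parts of 4.10 come from a computation with 2 × 2 matrices through the homomorphism of 4.9. The following minimal sketch carries out that computation in SL₂ over the rationals only; it does not model G itself, and the helper names are illustrative. Upper and lower unitriangular matrices stand for x_a(k) and x_{−a}(k), and the matrix with rows (0, 1), (−1, 0) stands for ω(w_a).

```python
from fractions import Fraction

def mul(*mats):
    """Left-to-right product of 2x2 matrices given as ((a, b), (c, d))."""
    result = ((1, 0), (0, 1))
    for m in mats:
        (a, b), (c, d) = result
        (e, f), (g, h) = m
        result = ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))
    return result

def x_pos(k):
    """Matrix standing for x_a(k)."""
    return ((1, k), (0, 1))

def x_neg(k):
    """Matrix standing for x_{-a}(k)."""
    return ((1, 0), (k, 1))

w = ((0, 1), (-1, 0))   # matrix standing for omega(w_a)

def braid_holds(k):
    return mul(x_pos(1), x_neg(k), x_pos(1)) == mul(x_neg(k), x_pos(1), x_neg(k))

# First part: among small integers, only k = -1 satisfies the equation.
solutions = [k for k in range(-10, 11) if braid_holds(k)]

def is_monomial(m):
    """True if m is antidiagonal, i.e. a diagonal matrix times w."""
    (a, b), (c, d) = m
    return a == 0 and d == 0

# Second part: with m = -1/l and t = l + k*l**2 the product is antidiagonal.
k, l = Fraction(3), Fraction(7, 2)
m, t = -1 / l, l + k * l**2
product = mul(x_pos(k), x_neg(l), x_pos(m), x_neg(t))
perturbed = mul(x_pos(k), x_neg(l), x_pos(m + 1), x_neg(t))

print(solutions, is_monomial(product), is_monomial(perturbed))
```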


4.11. If w = w_a, a ∈ Σ, then (1) T = union 𝔙ℌ, 𝔙ℌω(w)𝔙_w is a group
and (2) T ∩ 𝔘 = 𝔘_w.


Proof. By 4.8, T is closed under right (or left) multiplication by 𝔙ℌ.
A consequence of 4.9 (see (4, p. 34, Lemma 2)) is ω(w)𝔙_wω(w)⁻¹ ⊂ union
𝔙ℌ, 𝔙ℌω(w)𝔙_w. Thus Tω(w)⁻¹ ⊂ T, hence TT⁻¹ ⊂ T, and T is a group.
Next write 𝔙 = 𝔙_w𝔙_w′. Then ω(w)𝔙_wω(w)⁻¹ = 𝔘_w and ω(w)𝔙_w′ω(w)⁻¹ =
𝔙_w′ by 4.5 so that T = ω(w)Tω(w)⁻¹ = union 𝔘_w𝔙_w′ℌ, 𝔘_w𝔙_w′ℌω(w)𝔘_w.
Now 𝔘_w𝔙_w′ℌ ∩ 𝔘 = 𝔘_w by 4.2; and if w₀ ∈ W is defined by w₀Π = −Π, then
𝔘ℌω(w₀w)𝔘 ∩ ω(w₀)𝔘 = 0 by 4.8, when left multiplication by ω(w₀)⁻¹ yields
𝔙ℌω(w)𝔘 ∩ 𝔘 = 0 and then 𝔘_w𝔙_w′ℌω(w)𝔘_w ∩ 𝔘 = 0. Thus T ∩ 𝔘 = 𝔘_w.


4.12. Among the double cosets 𝔙ℌω(w)𝔙_w for which w ≠ 1 and (1) of 4.11
holds, those for which w has the form w = w_a, a ∈ Σ, are characterized by the
fact that T ∩ 𝔘 is minimal.


Proof. If w does not have the form w = w_a, a ∈ Σ, then T ⊃ ω(w)𝔙_wω(w)⁻¹
= 𝔘_{w⁻¹} ⊃ 𝔛_b for some b ∈ Σ, the last inclusion being proper; thus T ∩ 𝔘
is not minimal by 4.11. If a, b ∈ Σ, a ≠ b, then 𝔛_a ⊅ 𝔛_b; hence T ∩ 𝔘 with
w = w_a, a ∈ Σ, is minimal by 4.11.

Because of the results of (15), the twisted types of groups have corresponding
properties whose proofs are entirely analogous to those given above
and in (4).


5. Proof of 3.2 for the normal types. Throughout this section and the
next assume that G is a normal type. The method of proof is as follows: we
start with an arbitrary automorphism σ of G and multiply in turn by an inner,
a diagonal, a graph and a field automorphism, referring at each stage to a
normalization of σ; the final normalization yields σ = 1, whence 3.2 soon
follows. Only in the first step is the finiteness of K used. Hence the rest of the
argument is phrased so as to be applicable even if K is infinite.


5.1. If K is finite, the automorphism σ can be normalized by an inner
automorphism of G so that 𝔘^σ = 𝔘 and 𝔙^σ = 𝔙. If this is done, then ℌ^σ = ℌ and
there is a permutation ρ of the fundamental roots such that 𝔛_a^σ = 𝔛_{ρa} and
𝔛_{−a}^σ = 𝔛_{−ρa} for each fundamental root a.


Proof. By 4.3, 𝔘, 𝔙, 𝔘^σ, and 𝔙^σ are all p-Sylow subgroups of G, hence are
conjugate. Thus one can normalize σ by an inner automorphism to fulfil
𝔘^σ = 𝔘. Now 𝔙^σ = x⁻¹𝔘x for some x in G; thus 𝔙^σ = u⁻¹ω(w)⁻¹𝔘ω(w)u
with u ∈ 𝔘, w ∈ W, by 4.8 and 4.2. Since 𝔙 ∩ 𝔘 = 1 by 4.2, one has 𝔙^σ ∩ 𝔘
= 1, whence w = w₀ (defined by w₀Π = −Π) by 4.5, and then 𝔙^σ = u⁻¹𝔙u
by 4.5. A second normalization, the inner automorphism effected by u, now
yields 𝔙^σ = 𝔙. Then 𝔘ℌ and 𝔙ℌ are invariant under σ by 4.2, and so is ℌ,
since ℌ = 𝔘ℌ ∩ 𝔙ℌ by 4.2. The double cosets of G relative to 𝔙ℌ (𝔘ℌ)
are thus permuted by σ, and by 4.12 there exists a permutation ρ (a
permutation τ) of the fundamental roots (of their negatives) such that 𝔛_a^σ =
𝔛_{ρa} and 𝔛_{−a}^σ = 𝔛_{τ(−a)} for each a ∈ Σ. If b and c are in Σ and b ≠ c, then
b + (−c) is not a root, hence X_bX_{−c} = 0 (in the Lie algebra) and 𝔛_b commutes
with 𝔛_{−c}; if b = c, then 𝔛_b does not commute with 𝔛_{−c} by 4.9. Setting b = ρa,
c = −τ(−a), one concludes ρa = −τ(−a). Thus 5.1 is proved.


5.2. The normalization of σ attained in 5.1 can be refined by application of a
diagonal automorphism of G so that in addition x_a(1)^σ = x_{ρa}(1) for each
fundamental root a. It is then true that x_{−a}(1)^σ = x_{−ρa}(1), ω(w_a)^σ = ω(w_{ρa}), and the
orders of w_aw_b and w_{ρa}w_{ρb} are equal for any fundamental roots a and b.


Proof. Let x_a(1)^σ = x_{ρa}(k_a), a ∈ Σ. Then there exists a homomorphism h
of the additive group generated by the roots into K* such that h(a) = k_a,
a ∈ Σ. Application of the corresponding diagonal automorphism now yields
the refinement x_a(1)^σ = x_{ρa}(1), a ∈ Σ, by 3.1. Applying σ to the first equation
of 4.10, one then gets x_{−a}(−1)^σ = x_{−ρa}(−1) (so that x_{−a}(1)^σ = x_{−ρa}(1)) and
ω(w_a)^σ = ω(w_{ρa}) for each a ∈ Σ. Lastly, the orders of w_aw_b and w_{ρa}w_{ρb} are
respectively equal to the orders of ω(w_a)ω(w_b) and ω(w_{ρa})ω(w_{ρb}) mod ℌ by
4.6, hence are equal to each other because σ is an automorphism.

The last conclusion can be interpreted geometrically. If the order of w_aw_b
is n, then the angle between a and b is π − π/n. Thus ρ effects an angle
preserving permutation of the fundamental roots, hence is the identity unless
the corresponding graph has extra “angular” symmetries (see (3, p. 18)).
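
The relation between the angle π − π/n and the order n of w_aw_b can be checked numerically in the plane. This is a minimal sketch using floating-point reflections in R²; the function names and the tolerance are assumptions made for the illustration.

```python
import math

def reflection(angle):
    """Matrix of the reflection in the line orthogonal to (cos angle, sin angle)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1 - 2 * c * c, -2 * c * s], [-2 * c * s, 1 - 2 * s * s]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def is_identity(P, tol=1e-9):
    return all(abs(P[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(2) for j in range(2))

def order_of_product(n, max_order=24):
    """Order of w_a w_b when the roots a and b enclose the angle pi - pi/n."""
    w_a = reflection(0.0)
    w_b = reflection(math.pi - math.pi / n)
    step = matmul(w_a, w_b)
    power, k = step, 1
    while not is_identity(power):
        power = matmul(power, step)
        k += 1
        if k > max_order:
            raise ValueError("order exceeds max_order")
    return k

# n = 2, 3, 4, 6 correspond to the angles 90, 120, 135 and 150 degrees.
print([order_of_product(n) for n in (2, 3, 4, 6)])
```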


5.3. The normalization of σ in 5.2 can be refined by application of a graph
automorphism of G so that ρ is the identity.


Proof. Suppose first that G is of type F₄ or B₂ and that ρ is not the identity.
Then there are a, b ∈ Σ such that a + b and a + 2b are roots, ρa = b and
ρb = a, by the remarks above. Let α and β be the maps of K defined by
x_a(k)^σ = x_b(k^α) and x_b(l)^σ = x_a(l^β). One has x_{a+b}(l) = ω(w_a)x_b(l)ω(w_a)⁻¹ and
x_{a+2b}(k) = ω(w_b)x_a(k)ω(w_b)⁻¹ by 4.5 (in which the normalization η = 1 is
achieved by replacing X_{a+b} or X_{a+2b} by its negative if necessary). Applying
σ to these equations, one gets x_{a+b}(l)^σ = x_{a+2b}(l^β) and x_{a+2b}(k)^σ = x_{a+b}(k^α).
Consider the commutator equation


5.4. (x_a(k), x_b(l)) = x_{a+b}(δkl) x_{a+2b}(εkl²),


with each of δ and ε equal to ±1 and independent of k and l by (4, p. 27,
ll. 22-26). Apply σ to 5.4:


5.5. (x_b(k^α), x_a(l^β)) = x_{a+2b}((δkl)^β) x_{a+b}((εkl²)^α).


Now let us replace k and l by l^β and k^α respectively in 5.4 and take inverses:


5.6. (x_b(k^α), x_a(l^β)) = x_{a+2b}(−εl^β(k^α)²) x_{a+b}(−δl^βk^α).

Comparing the 𝔛_{a+2b} components of 5.5 and 5.6 with l = 1, one gets δk^β =
−ε(k^α)² by 4.1 and 5.2. Setting first k = 1 and then k = −1, one gets
δ = −ε and −δ = −ε, whence δ + δ = 0, so that K is of characteristic 2.
Then since β is onto, k^β = (k^α)² implies that K is perfect. Hence (see the
paragraph preceding 3.2) there exists a graph automorphism which normalizes
σ so that ρa = a and ρb = b. As is easily seen, ρ is now the identity. If G is
of type G₂, one proceeds similarly and finds that a normalization is not
required unless K is perfect and of characteristic 3. If G is of type A_l, D_l, or
E₆ and K is arbitrary, then again there is a graph automorphism to normalize
ρ to the identity. In all other cases, due to the lack of “angular” symmetry
of the graph, ρ is already the identity. Thus 5.3 is proved.


5.7. The normalization of $\sigma$ in 5.3 can be refined to $\sigma = 1$ by application of
a field automorphism of G.

That is, if $\sigma$ satisfies $\mathfrak{U}^\sigma = \mathfrak{U}$, $\mathfrak{V}^\sigma = \mathfrak{V}$, and $x_a(1)^\sigma = x_a(1)$ for each $a \in \Pi$,
then $\sigma$ is a field automorphism.


Proof. Choose $a \in \Pi$ and define $\alpha$ by $x_a(k)^\sigma = x_a(k^\alpha)$. We first show that
$\alpha$ is an automorphism of K by the method of Schreier and van der Waerden
(12, p. 318). By 5.2 we know that $\alpha$ maps K onto K and that $1^\alpha = 1$. The
equation $x_a(k + l) = x_a(k)x_a(l)$ implies that $(k + l)^\alpha = k^\alpha + l^\alpha$. The equation
4.5 with $r = a$ and $w = w_a$ yields $x_{-a}(k)^\sigma = x_{-a}(k^\alpha)$, and then the second part
of 4.10 implies $(l + kl^2)^\alpha = l^\alpha + k^\alpha(l^\alpha)^2$, whence $(kl^2)^\alpha = k^\alpha(l^\alpha)^2$ and
$(l^2)^\alpha = (l^\alpha)^2$. If K is of characteristic 2, then
$((kl)^\alpha)^2 = ((kl)^2)^\alpha = (k^2l^2)^\alpha = (k^2)^\alpha(l^\alpha)^2 = (k^\alpha)^2(l^\alpha)^2 = (k^\alpha l^\alpha)^2$,
whence $(kl)^\alpha = k^\alpha l^\alpha$; if K is of characteristic other than
2, then polarization of the equation $(l^2)^\alpha = (l^\alpha)^2$ yields $(kl)^\alpha = k^\alpha l^\alpha$. Thus
in either case $\alpha$ is an automorphism of K. Now choose a second root $b \in \Pi$
(if one exists) such that $a + b$ is a root, and let $\beta$ be the corresponding
automorphism of K. By labelling appropriately $a$ and $b$ one may assume that
$\omega(w_a)\mathfrak{X}_b\omega(w_a)^{-1} = \mathfrak{X}_{a+b}$. Then applying $\sigma$ to the equation
$(x_a(k), x_b(1)) = x_{a+b}(\delta k)\ldots$ as in 5.3 (but now with $\rho$ the identity), one gets $k^\beta = k^\alpha$, whence
$\alpha = \beta$. Since any 2 roots of $\Pi$ are the end terms of a sequence of roots of $\Pi$
such that the sum of each consecutive pair is a root (in other words, the graph
is connected), it follows that there is a single automorphism $\gamma$ of K such that
$x_c(k)^\sigma = x_c(k^\gamma)$ for each $c$ in $\Pi$. Normalization of $\sigma$ by the field automorphism
of G corresponding to $\gamma^{-1}$ now yields $x_c(k)^\sigma = x_c(k)$, $c \in \Pi$. One has also
$\omega(w_c)^\sigma = \omega(w_c)$, $c \in \Pi$, by 5.3. However, since W is generated by the elements
$w_c$, $c \in \Pi$, and each root has the form $wc$ with $w \in W$, $c \in \Pi$, it follows from
4.5 that G is generated by the elements $x_c(k)$ and $\omega(w_c)$ with $k \in K$, $c \in \Pi$.
Hence $\sigma = 1$, and 5.7 is proved.
Let us now prove 3.2. Let $\sigma$ be an automorphism of G and let $i$, $d$, $g$, and $f'$
be the respective inner, diagonal, graph, and field automorphisms used in




















AUTOMORPHISMS OF FINITE LINEAR GROUPS 613 


5.1, 5.2, 5.3, and 5.7 to achieve the normalization of $\sigma^{-1}$. One has $f'gdi\sigma^{-1} = 1$,
thus $\sigma = f'gdi$. Since $g^{-1}f'g = f$ is in F by 5.7, one gets $\sigma = gfdi$, and the
first statement of 3.2 is proved. Now suppose $\sigma = g_1f_1d_1i_1$ is a second
representation of $\sigma$ in the indicated form. Then $d^{-1}f^{-1}g^{-1}g_1f_1d_1 = ii_1^{-1}$. The left
side of this equation maps $\mathfrak{U}$ onto $\mathfrak{U}$ and $\mathfrak{V}$ onto $\mathfrak{V}$. Hence
$ii_1^{-1} \in \mathfrak{U}\mathfrak{H} \cap \mathfrak{V}\mathfrak{H} = \mathfrak{H}$ by 4.2. Then
$f^{-1}g^{-1}g_1f_1 = dii_1^{-1}d_1^{-1} \in \hat{\mathfrak{H}}$. This element leaves fixed each
$x_a(1)$, $a \in \Pi$, hence $dii_1^{-1}d_1^{-1} = 1$ by 3.1; that is, $g^{-1}g_1 = ff_1^{-1}$. This implies
that $g$ and $g_1$ effect the same permutation of the groups $\mathfrak{X}_a$, $a \in \Pi$. Hence
$g = g_1$, then $f = f_1$, and 3.2 is proved completely.


6. Proof of 3.3 to 3.6 for the normal types. The group G of inner 
automorphisms is clearly a normal subgroup of each of G, A, and A. This 
implies G= SG. One has also § (\G = © since § C.. S$ A\G by 4.2, whereas 
h€ SC\@ implies WU’ = U, B* = B and then h € USM BVH = G by 4.2. 
Thus G/G = $/(S AG) = $/G. The specific results of 3.4 can now be 
verified from the fact that the Cartan integers 2(a, 6)/(a,a) (a,b € 3), 
taken mod (g — 1), build a relation matrix for $/© (4, p. 48, Il. 13-18). 
It is easily verified (from the definitions) that fGf-' = § for each f in F. 
Hence G is normal in A and A = FSG, whence the uniqueness feature of 3.2 
implies 3.5. Finally, if g is a graph automorphism, one verifies (by considering 
the effect on each x,(k)) that gFg-! = F and gSg-' = §, whence A is normal 
in A, and 3.3 is completely proved. The uniqueness feature of 3.2 then implies 
the first statement of 3.6; the last statement follows from the definition of 
graph automorphism given in the paragraph before 3.2. 


7. The twisted types. The proofs of 3.2 to 3.6 for the twisted types are 
virtually the same as those given above for the normal types and to a large 
extent involve little more than a change of notation in view of the structural 
properties developed for the twisted types in (15). A comparison of 4.1 and 
5.4 with their analogues 4.5 and 8.8 in (15) should make completely clear 
what modifications are to be made, if G is not $A_l^1$ ($l$ even), and even in the
latter case if $l \geq 4$. This leaves the group $A_2^1$ to be considered. Although the
proofs in this case are also of the same genre as those given above for the 
normal types, the details are sufficiently more complicated to warrant a 
separate exposition, especially since the case in which K has few elements has 
not been completely treated elsewhere. 

Let us recall that $A_2^1$ is a subgroup of $A_2$ and may be identified with a
3-dimensional projective unimodular unitary group (15). The positive roots
of $A_2$ can be written as $a$, $\bar{a}$, and $b$ (with $b = a + \bar{a}$), and then the elements
of $\mathfrak{U}$ take the form $x_a(k)x_{\bar{a}}(\bar{k})x_b(l)$, subject to $k\bar{k} = l + \bar{l}$ (15, Lemma 4.6).
For given $k$ this last equation is always solvable for $l$: choose $m$ so that
$m + \bar{m} \neq 0$, and then set $l = k\bar{k}m(m + \bar{m})^{-1}$. For convenience, we denote
$x_a(k)x_{\bar{a}}(\bar{k})x_b(l)$ by $(k|l)$, so that the rule of multiplication is
$(k|l)(m|n) = (k + m \mid l + n + \bar{k}m)$ (see (4, p. 27, ll. 22-26) or use the unitary
identification). From this it follows that the elements $(0|t)$, subject to $t + \bar{t} = 0$,
build the centre $\mathfrak{Z}$ of $\mathfrak{U}$. Let us now turn to the proof of 3.2.

If ¢ is an automorphism of G, it can be normalized by an inner automor- 
phism, just as before, so that Uv = U and B = B; assume this is done. Then 
G = &, and since (R{/,)—'(R\/.) € ©, the map a@ defined by (R{l)" = (k*|m) 
is single-valued; clearly a is also onto. 

The normalization of ¢ can now be refined by application of a diagonal 
automorphism of G so that 1* = 1, and the next thing to be proved is that a 
becomes an automorphism of K. If K has 4 elements, then a@ leaves fixed 
0 and 1 and permutes the other 2 elements of K, hence is an automorphism. 
Thus in the rest of the proof we may assume that K has more than 4 elements. 
Because ¢o is an automorphism we have (k + /])* = k* + 2; k, 1 © K. Next 
if h € and h(a) = k, we have hA(/|*)h-' = (Ri\*), whence h(/*|*)(h7)—' = 
((kl)*|*). Setting 1 = 1, we get h’(a) = k*, and then from the equation, 
(kl)* = kel*. Here / is arbitrary, but & is restricted to the set S of numbers of 
the form m?m-! (see the definition of 5 in §3). The field Ko of numbers left 
fixed by k > k is contained in S, and this inclusion is proper: if K is of character- 
istic 3 and m¢ Ko, then m*m-' ¢ Ko; if K is of characteristic other than 3 
and k¢ Ko, r € Ko, r+0,1, then not all three of m, = k, mz = 1+ k, 
m; = r + k can have cubes in Ko because, in the contrary case, differencing 
yields k + k*, rk +k? © Ko and then k € Ko, a contradiction, and so 
mm! ¢ Ko for some i = 1, 2, or 3. There is thus k € S such that k ¢ Ko, 
and each element of K can be written as rk + s (r,s € Ko), that is, as the 
sum of 2 elements of S. Since a@ is additive on K and multiplicative on S, this 
implies that @ is multiplicative on all of K and hence is an automorphism. 

Thus the normalization of ¢ can be refined by a field automorphism of G 
so that @ is the identity, and what remains to be shown is that now o = 1. 
Choose k¢ Ko, and set 7 =k—&k. Then o applied to ((1|*), (k|*)) = 
(O|k — k) = (O|j) yields (O\j)" = (O|j), that is, o leaves fixed x,(j). A 
slight extension of the first statement in 4.10 shows that x_,(—j-') and 
X»(j)x-»(—j~")x»(j) are also left fixed by ¢, and that the latter elemeat is in 
w(w)$ and so may be denoted w(w) after a normalization. A final calculation 
shows that for given (R|/), 1 = 0, one has (k\/)w(w)(m|n) € BH if and only 
if m = jkl-' and n = jjl-'. If this condition is met, if (k\l)* = (k\l;) and if 
(m|n)* = (m|\n,), then application of « yields m = jkl,-', whence /,; = / and 
(R\l)* = (Ril). Thus ¢ = 1 on U. Since w(w)’ = w(w) and w(w)Ulw(w)' = &, 
we get o = 1. The first statement of 3.2 is hereby proved, and the other 
statement as well as 3.3 to 3.6 follow from it, just as before. 


8. Final observations. As we have already stated, the finiteness of K is 
used above only in the proof that an automorphism of G necessarily maps U 
onto one of its conjugates. If K is algebraically closed, this fact is proved in 
(13) by rather advanced methods of topology and algebraic geometry. It is 
hoped that an elementary proof, along the lines of the present article, can be
found to handle all fields K simultaneously. In regard to (13), we also mention
that the proof of existence of graph automorphisms for $B_2$, $F_4$, and $G_2$ is quite
long and that a shortened self-contained treatment is desirable.

An interesting special case of such an automorphism occurs when G is
$B_2(2)$, the group of type $B_2$ over a field K of 2 elements. This group is
isomorphic to $S_6$, the only symmetric group which admits outer automorphisms:
one of these shows up as a graph automorphism which owes its existence to
the fact that K has characteristic 2. It is also interesting to compare the
groups $A_2(4)$ and $A_3(2)$. For the first the order of A/G is 12 and for the second
it is 2. Thus one has another proof of the well-known fact that these groups,
both of order 20160, are not isomorphic.
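
The orders quoted in this paragraph can be checked with a few lines of arithmetic. The sketch below assumes the standard order formulas for the projective special linear groups (type $A_{n-1}$) and for the projective symplectic group of degree 4 (type $B_2$); the function names are illustrative and do not come from the paper.

```python
from math import factorial, gcd


def order_psl(n, q):
    """Order of PSL_n(q), the group of type A_{n-1} over a field of q elements."""
    order = q ** (n * (n - 1) // 2)
    for i in range(2, n + 1):
        order *= q ** i - 1
    return order // gcd(n, q - 1)


def order_psp4(q):
    """Order of PSp_4(q), the group of type B_2 over a field of q elements."""
    return q ** 4 * (q ** 2 - 1) * (q ** 4 - 1) // gcd(2, q - 1)


print(order_psl(3, 4), order_psl(4, 2))  # groups of type A_2 over 4 elements, A_3 over 2
print(order_psp4(2), factorial(6))       # type B_2 over 2 elements, symmetric group on 6 letters
```

Equal orders alone do not decide isomorphism, which is why the comparison of the outer automorphism groups in the text is needed.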

Finally let us remark that a companion problem to that treated here, 
namely the determination of the isomorphisms among the various finite 
groups of §2 is handled in (1) by uniform number-theoretic methods. In this 
connection we also refer the reader to 12.5 of (15) which can be used to 
eliminate some of the computations of (1). 


REFERENCES 


1. E. Artin, Orders of classical simple groups, Comm. Pure Appl. Math., 8 (1955), 455.

2. F. Bruhat, Représentations induites des groupes de Lie semi-simples connexes, C. R. Acad.
Sci. Paris, 238 (1954), 437.

3. E. Cartan, Oeuvres complètes (Paris, 1952).

4. C. Chevalley, Sur certains groupes simples, Tôhoku Math. J., 7 (1955), 14.

5. L. E. Dickson, Linear groups (Leipzig, 1901).

6. ——— A new system of simple groups, Math. Ann., 60 (1905), 137.

7. J. Dieudonné, On the automorphisms of the classical groups, Mem. Amer. Math. Soc., 2
(1951).

8. ——— La géométrie des groupes classiques, Ergeb. der Math. u. i. Grenz. (1955).


9. E. B. Dynkin, The structure of semi-simple algebras, Amer. Math. Soc. Translation No. 17. 

10. D. Hertzig, On simple algebraic groups, Short communications, Int. Congress of Math. 
(Edinburgh, 1958). 

11. R. Ree, On some simple groups defined by C. Chevalley, Trans. Amer. Math. Soc., 84 (1957), 
392. 

12. O. Schreier and B. L. van der Waerden, Die Automorphismen der projektiven Gruppen, 
Abh. Math. Sem. Univ. Hamburg, 6 (1928), 303. 

13. Séminaire C. Chevalley, Classification des Groupes de Lie Algébriques (Paris, 1956-8). 

14. Séminaire “‘Sophus Lie”’ (Paris, 1954-1955). 

15. R. Steinberg, Variations on a theme of Chevalley, Pac. J. Math., 9 (1959), 875. 

16. H. Weyl, The structure and representations of continuous groups, I. A. S. notes (Princeton,
1934-1935).


University of California, Los Angeles 











INVARIANTS OF FINITE REFLECTION GROUPS 
ROBERT STEINBERG 


Let us define a reflection to be a unitary transformation, other than the
identity, which leaves fixed, pointwise, a (reflecting) hyperplane, that is, a
subspace of deficiency 1, and a reflection group to be a group generated by
reflections. Chevalley (1) (and also Coxeter (2) together with Shephard and
Todd (4)) has shown that a reflection group G, acting on a space of n
dimensions, possesses a set of n algebraically independent (polynomial) invariants
which form a polynomial basis for the set of all invariants of G. Our aim here
is to prove:


THEOREM. Let G be a finite reflection group, acting on a space V of finite 
dimension. Let J be the Jacobian (matrix) of a basic set of invariants of G, com- 
puted relative to any basis of V. Let p be any point of V. Then the following 
numbers are equal: 

(a) the maximum number of linearly independent reflecting hyperplanes 
containing p; 

(b) the maximum rank of 1 — x for all x in G for which xp = p; 

(c) the nullity of J at p. 


The equality of the numbers defined in (b) and (c) is the essence of a 
conjecture of Shephard (3). 
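
To make the three numbers of the theorem concrete, here is a small worked example. It is not taken from the paper: it assumes the group of the 8 signed permutation matrices acting on the plane, with basic invariants $x^2 + y^2$ and $x^2y^2$ and reflecting lines $x = 0$, $y = 0$, $x = y$, $x = -y$. All function names are hypothetical. The Jacobian is written with one row per invariant, which does not affect its rank.

```python
from fractions import Fraction
from itertools import product

# Assumed example group: the 8 signed permutation matrices of size 2.
GROUP = []
for sx, sy in product((1, -1), repeat=2):
    GROUP.append(((sx, 0), (0, sy)))   # sign changes of the coordinates
    GROUP.append(((0, sx), (sy, 0)))   # coordinate swap combined with sign changes

# Linear forms a*x + b*y vanishing on the four reflecting lines.
LINES = [(1, 0), (0, 1), (1, -1), (1, 1)]


def rank(rows):
    """Rank of a small matrix by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c] / m[r][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def number_a(p):
    """(a): maximum number of linearly independent reflecting lines through p."""
    through_p = [l for l in LINES if l[0] * p[0] + l[1] * p[1] == 0]
    return rank(through_p)


def number_b(p):
    """(b): maximum rank of 1 - x over the group elements x fixing p."""
    best = 0
    for g in GROUP:
        image = (g[0][0] * p[0] + g[0][1] * p[1], g[1][0] * p[0] + g[1][1] * p[1])
        if image == tuple(p):
            one_minus_g = [[1 - g[0][0], -g[0][1]], [-g[1][0], 1 - g[1][1]]]
            best = max(best, rank(one_minus_g))
    return best


def number_c(p):
    """(c): nullity at p of the Jacobian of x^2 + y^2 and x^2 * y^2."""
    x, y = p
    jacobian = [[2 * x, 2 * y], [2 * x * y * y, 2 * x * x * y]]
    return 2 - rank(jacobian)


for p in [(0, 0), (1, 0), (1, 1), (1, 2)]:
    print(p, number_a(p), number_b(p), number_c(p))
```

In this example the determinant of the Jacobian is $4xy(x - y)(x + y)$, a scalar multiple of the product of the four linear forms, in agreement with the lemma below.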

Throughout the paper, G is a reflection group, of finite order g, acting on a
space V of n dimensions. The symbols $L_1, \ldots, L_v$ denote the hyperplanes in
which reflections of G take place, as well as non-zero linear forms which vanish
on the corresponding hyperplanes, and for each $i$, $a_i$ is a corresponding
non-zero normal vector, $r_i$ is the order of the (cyclic) subgroup of G which leaves
$L_i$ fixed pointwise, and $R_i$ is a generator of this subgroup. Finally, $I_1, \ldots, I_n$
are basic invariants of G; $d_1, \ldots, d_n$ are their degrees; and J (generically)
denotes their Jacobian, relative to whatever basis is at hand.


LEMMA. For some non-zero scalar c,
$$\det J = c \prod_{i=1}^{v} L_i^{r_i - 1}.$$
A proof of this well-known result will be included because it and the corollary
below play a key role in the proof of the theorem. Choose an orthonormal
basis of V so that the first co-ordinate $x_1$ is a multiple of $L_1$. If $I$ is any
invariant of G, the equation $R_1 I = I$ implies that $I$ is a polynomial in $x_1^{r_1}$,
whence $x_1^{r_1 - 1}$ divides $\partial I/\partial x_1$.

Received July 15, 1959.
Thus the first row of J, and hence also det J, is divisible by $x_1^{r_1 - 1}$, and
hence also by $L_1^{r_1 - 1}$. Similarly, det J is divisible by each $L_i^{r_i - 1}$. Using the formula
$$\sum_j (d_j - 1) = \sum_i (r_i - 1),$$
proved in (4, p. 290, l. 12), a comparison of degrees shows that the factor c
in the statement of the lemma is a scalar, non-zero because the $I_j$ are
algebraically independent.

From the first part of the proof we have: 


COROLLARY. The determinant of the Jacobian of any n invariants of G is
divisible by $\prod_i L_i^{r_i - 1}$.


Proof of the theorem. If $k$, $l$, and $m$ denote the respective numbers defined
by (a), (b), and (c), we prove in turn that $m \le k$, $k \le l$, and $l \le m$.

First label the L's so that $L_1, \ldots, L_u$ are those which contain $p$, and then
choose an orthonormal basis $p_1, \ldots, p_n$ of V so that $p_1, \ldots, p_k$ span the
same subspace as $a_1, \ldots, a_u$, the normals to the L's. Let G' be the (reflection)
group generated by $R_1, \ldots, R_u$. The co-ordinates $x_{k+1} = I_{k+1}', \ldots, x_n = I_n'$
are invariants of G'. If $I_1', \ldots, I_k'$ are any invariants of G, they are also
invariants of G', and the corollary above shows that
$$\prod_{i=1}^{u} L_i^{r_i - 1}$$
divides
$$\partial(I_1', \ldots, I_n')/\partial(x_1, \ldots, x_n),$$
that is, divides
$$\partial(I_1', \ldots, I_k')/\partial(x_1, \ldots, x_k).$$
Consider now the expansion of det J across the first k rows:
$$\det J = \sum \pm J'(i_1, \ldots, i_k)\, J''(i_{k+1}, \ldots, i_n)$$
with $J'(i_1, \ldots, i_k)$ denoting the minor corresponding to the rows $1, \ldots, k$ and
columns $i_1, \ldots, i_k$ of J, $J''(i_{k+1}, \ldots, i_n)$ denoting the minor corresponding to
the rows $k + 1, \ldots, n$ and columns $i_{k+1}, \ldots, i_n$, and the sum being over all
permutations $i_1, \ldots, i_n$ of $1, \ldots, n$ for which $i_1 < \ldots < i_k$ and
$i_{k+1} < \ldots < i_n$. By what has just been shown, each $J'$ is divisible by
$$\prod_{i=1}^{u} L_i^{r_i - 1},$$
so that, by the lemma, there are polynomials $M(i_1, \ldots, i_k)$ such that
$$\prod_{i=u+1}^{v} L_i^{r_i - 1} = \sum M(i_1, \ldots, i_k)\, J''(i_{k+1}, \ldots, i_n).$$
Since the left side of this equation is not 0 at $p$, we conclude that some $J''$
is not 0 at $p$, whence J has rank $n - k$ at least and nullity $k$ at most at $p$.
Thus $m \le k$.

Next, assume that the labelling is such that $L_1, \ldots, L_k$ contain $p$ and are
linearly independent. Set $x = R_1R_2 \ldots R_k$. Suppose $xq = q$, with $q \in V$.
Then $R_1^{-1}q = R_2 \ldots R_kq$ implies that
$$q + c_1a_1 = q + c_2a_2 + \ldots + c_ka_k$$
for suitable scalars $c_j$, whence, because of the linear independence of the $a_j$,
we conclude that $c_1 = 0$ and $R_1q = q$. Similarly $R_2q = q, \ldots, R_kq = q$,
hence $q$ lies in each of $L_1, \ldots, L_k$, and the solution space of the equation
$xq = q$ has dimension $n - k$. Thus $1 - x$ has rank $k$, and the inequality
$k \le l$ has been established.

Finally choose $x \in G$ so that $1 - x$ has rank $l$ and $xp = p$, and then an
orthonormal basis $p_1, \ldots, p_n$ of V so that $xp_j = c_jp_j$ with $c_j \ne 1$ for $1 \le j \le l$
and $c_j = 1$ for $l + 1 \le j \le n$. If $I$ is an invariant of G, the equation $xI = I$
implies that each term of $I$ has a total exponent in the co-ordinates $x_1, \ldots, x_l$
which is either 0 or at least 2. Thus for each $j$ such that $1 \le j \le l$, $\partial I/\partial x_j$
is 0 at any point at which $x_1, \ldots, x_l$ are all 0, in particular, at $p$. This implies
that the first $l$ rows of J vanish at $p$, whence $l \le m$.

Thus the theorem is completely proved. 


REFERENCES 


1. C. Chevalley, Invariants of finite groups generated by reflections, Amer. J. Math., 77 (1955), 778.
2. H. S. M. Coxeter, The product of the generators of a finite group generated by reflections, Duke
Math. J., 18 (1951), 765.
3. G. C. Shephard, Some problems of finite reflection groups, Enseignement Math., II (1956), 42.
4. G. C. Shephard and J. A. Todd, Finite unitary reflection groups, Can. J. Math., 6 (1954), 274.


University of California, Los Angeles 







A METRICAL THEOREM IN DIOPHANTINE 
APPROXIMATION 


WOLFGANG SCHMIDT 


Introduction. In this paper we prove a sharpening and generalization
of the following Theorem of Khintchine (4):

Let $\psi_1(q), \ldots, \psi_n(q)$ be n non-negative functions of the positive integer q and
assume
$$\psi(q) = \prod_{i=1}^{n} \psi_i(q)$$
is monotonically decreasing. Then the set of inequalities
$$0 \le q\theta_i - p_i < \psi_i(q) \qquad (i = 1, \ldots, n) \tag{1}$$
has an infinity of integer solutions $q > 0$ and $p_1, \ldots, p_n$ for almost all or no
sets of numbers $\theta_1, \ldots, \theta_n$, according as $\sum \psi(q)$ diverges or converges.

Actually, Khintchine proved the Theorem with $|q\theta_i - p_i| < \psi_i(q)$ instead
of (1). The first author who used the one-sided inequalities (1) was Cassels (1).
Surprisingly, the following sharpening of the Theorem seems to have
escaped attention.


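THEOREM 1. Make the same assumptions as in Khintchine's Theorem. Let $\epsilon > 0$
be arbitrary. Write $N(h; \theta_1, \ldots, \theta_n)$ for the number of solutions of (1) with
$1 \le q \le h$ and put
$$\Psi(h) = \sum_{q=1}^{h} \psi(q) \tag{2}$$
$$\Omega(h) = \sum_{q=1}^{h} \psi(q)q^{-1} \tag{3}$$
Then
$$N(h; \theta_1, \ldots, \theta_n) = \Psi(h) + O(\Psi^{1/2}(h)\,\Omega^{1/2}(h) \log^{2+\epsilon} \Psi(h)) \tag{4}$$
for almost all sets $\theta_1, \ldots, \theta_n$.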
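
The quantities in Theorem 1 are easy to compute for a concrete choice. The following sketch assumes $n = 1$ and the illustrative function $\psi(q) = 1/q$ (so that $\Psi(h)$ grows like $\log h$ and $\Omega(h)$ stays bounded); the function names and the sample value of $\theta$ are assumptions made for the example, and floating-point arithmetic is used, so it is a numerical illustration rather than a proof.

```python
import math


def psi(q):
    """Assumed example function psi(q) = 1/q (non-negative, decreasing, divergent sum)."""
    return 1.0 / q


def count_solutions(theta, h):
    """N(h; theta): number of q <= h with 0 <= q*theta - p < psi(q) for some integer p."""
    count = 0
    for q in range(1, h + 1):
        fractional_part = (q * theta) % 1.0   # equals q*theta - p for p = floor(q*theta)
        if fractional_part < psi(q):
            count += 1
    return count


def big_psi(h):
    """Psi(h) as in (2)."""
    return sum(psi(q) for q in range(1, h + 1))


def big_omega(h):
    """Omega(h) as in (3)."""
    return sum(psi(q) / q for q in range(1, h + 1))


theta = math.sqrt(2)
for h in (10, 100, 1000, 10000):
    print(h, count_solutions(theta, h), round(big_psi(h), 3), round(big_omega(h), 3))
```

Since the theorem is a statement about almost all $\theta$, a single value such as $\sqrt{2}$ need not show the typical behaviour.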


Note. In this paper, log a stands as an abbreviation for
$$\begin{cases} \text{logarithm } a, & \text{if } a \ge e \\ 1, & \text{if } a < e. \end{cases}$$
Only $\log(1 + (1/(q - 1)))$ in (10) means logarithm, in spite of $1 + (1/(q - 1)) < e$.

Next, we generalize Khintchine's Theorem to linear forms. We use the
following notation. Throughout this paper, lower case italics denote rational
integers. By Q, R, ..., we denote lattice points $Q(q_1, \ldots, q_m)$ in $R_m$. $\Theta$
denotes points $(\theta_1, \ldots, \theta_m)$ in $R_m$. $\rho Q$, where $\rho$ is real, is the point with
co-ordinates $\rho q_1, \ldots, \rho q_m$, and $Q\Theta$ is the scalar product $q_1\theta_1 + \ldots + q_m\theta_m$.
We write d(Q) for the number of common divisors of $q_1, \ldots, q_m$. Finally, we
put $Q \le h$ if $q = \max(q_1, \ldots, q_m) \le h$, and similarly $h < Q$.

Received May 4, 1959.

fo 
THEOREM 2. Let $\epsilon > 0$ be arbitrary. Let $\psi_1(Q), \ldots, \psi_n(Q)$ be n bounded
non-negative functions. We introduce
$$\psi(Q) = \prod_{i=1}^{n} \psi_i(Q)$$
$$\Psi(h) = \sum_{Q \le h} \psi(Q)$$
$$\chi(h) = \sum_{Q \le h} \psi(Q)d(Q)$$
and write $N(h; \Theta_1, \ldots, \Theta_n)$ for the number of simultaneous solutions $Q \le h$,
$p_1, \ldots, p_n$ of the system
$$0 \le Q\Theta_i - p_i < \psi_i(Q) \qquad (i = 1, \ldots, n). \tag{5}$$
Then for almost all n-tuples $\Theta_1, \ldots, \Theta_n$
$$N(h; \Theta_1, \ldots, \Theta_n) = \Psi(h) + O(\chi^{1/2}(h) \log^{3/2+\epsilon} \chi(h)). \tag{6}$$

l 
Note. We need not assume $\psi(Q)$ to be monotonic in any co-ordinate.
This theorem can be interpreted as a generalization of the well-known fact
that the points $(Q\Theta_1, \ldots, Q\Theta_n)$ are uniformly distributed mod 1 for almost
all $\Theta_1, \ldots, \Theta_n$. (See, for instance, (3, chapter IV).) Indeed, putting $\psi_i(Q) = \alpha_i$,
$\alpha = \prod \alpha_i$, we have $\Psi(h) = \alpha h^m$ and
$$\chi(h) = \alpha \sum_{Q \le h} d(Q) = \begin{cases} O(h \log h), & \text{if } m = 1 \\ O(h^m), & \text{if } m > 1. \end{cases}$$
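
The two growth rates for $\sum_{Q \le h} d(Q)$ can be observed directly. The sketch below assumes lattice points with positive co-ordinates and takes $\alpha = 1$; the function names are illustrative. For $m = 2$ it also uses the elementary identity $\sum_{Q \le h} d(Q) = \sum_{d \le h} [h/d]^2$, which follows by counting, for each $d$, the points whose co-ordinates are both divisible by $d$.

```python
from math import gcd, log


def num_divisors(n):
    """Number of positive divisors of n."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)


def chi_one_variable(h):
    """Sum of d(q) over 1 <= q <= h (case m = 1)."""
    return sum(num_divisors(q) for q in range(1, h + 1))


def chi_two_variables(h):
    """Sum of d(Q) over Q = (q1, q2) with 1 <= q1, q2 <= h (case m = 2).

    The common divisors of q1 and q2 are exactly the divisors of gcd(q1, q2).
    """
    return sum(num_divisors(gcd(q1, q2))
               for q1 in range(1, h + 1) for q2 in range(1, h + 1))


for h in (10, 20, 40):
    print(h, chi_one_variable(h) / (h * log(h)), chi_two_variables(h) / h ** 2)
```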
An interesting special case of Theorem 2 is when $\psi(Q) = \psi(q)$, where
$q = \max(q_1, \ldots, q_m)$. Then


xi) = AS » d --: > via.) | 


Thus we have
$$\chi(h) = O(\Psi(h))$$
if $m \ge 3$, or if $m = 2$ and $q\psi(q)$ is monotonically decreasing, because in the
latter case


D vai): < dW (h). 


@iSh 
d\@ 










For example, if $\psi_i(Q) = \psi_i(q) = q^{-m/n}$, $\psi(Q) = q^{-m}$, $\Psi(h) = m \log h + O(1)$,
then for almost all $\Theta_1, \ldots, \Theta_n$
$$N(h; \Theta_1, \ldots, \Theta_n) = m \log h + O(\log^{1/2} h \,\operatorname{loglog}^{a+\epsilon} h),$$
where we may take $a = 2$ for $m = 1$, according to Theorem 1, and $a = 3/2$
for $m > 1$, according to Theorem 2.

For the proof we have to modify the standard proof of Khintchine's Theorem
and use some ideas of (2). The new idea in Theorem 1 is to use fractions $p/q$
with g.c.d.$(p, q) \le k$ where $k$ is specified later, instead of $p/q$ with
g.c.d.$(p, q) = 1$, as employed in (1; 3; 4). Theorems 1 and 2 should be
compared with similar results I proved recently in the geometry of numbers (5).

We give a detailed proof of Theorem 1 only. For convergent sums $\sum \psi(q)$
Theorem 1 follows from Khintchine's Theorem. Hence in §§ 1 to 4, which
deal with Theorem 1, we assume without explicit mention that $\psi(q)$ is a
non-negative, monotonically decreasing function with divergent sum $\sum \psi(q)$.
$\Psi(h)$ and $\Omega(h)$ are defined by (2) and (3). The author is much indebted to
the referee who discovered a mistake in the original draft and made valuable
suggestions.


1. On certain intervals. Let $w(h)$, $h \ge 1$, be a monotonically increasing
integral-valued function which tends to infinity. We write $w(0) = 0$ and
define S' to be the set consisting of 0 and of all integers $h > 0$ such that
$w(h - 1) < w(h)$. We define S'' to be the set of integers $h \ge 0$ having
$w(h) < w(h + 1)$. Finally, S is the set of values of $w(h)$, $h \ge 0$.

Next, we define for fixed $t > 0$ intervals of order $t$ to be the half-open
intervals
$$(u2^t + v_1, (u + 1)2^t + v_2],$$
where $u$, $v_1$, $v_2$ are non-negative integers such that $v_i < 2^t$ and $v_1$, $v_2$ are the
smallest non-negative integers satisfying $u2^t + v_1 \in S$, $(u + 1)2^t + v_2 \in S$.
(It is possible, of course, that for given $u$, $t$ there exists no such $v_i$.) The
intervals of order $t$ cover the positive axis exactly once.


LEMMA 1. Every interval $(0, x]$, $x \in S$, can be expressed as union of intervals
$\bigcup I_t$ of the type described above, where no two of the intervals $I_t$ are of the same
order.


Proof. Write $x$ in the binary scale,
$$x = \sum_{i=0}^{w} t_i 2^i,$$
where $t_i$ equals 0 or 1, but $t_w = 1$. There exists an interval $(0, j_1]$ of order $w$
with $j_1 \le x$. If $j_1 = x$, then we are through. If not, and if
$$j_1 = \sum_{i=0}^{w} t_i^{(1)} 2^i,$$
then $t_w^{(1)} = t_w = 1$ and there exists a largest integer $w_2$ having
$$t_{w_2}^{(1)} < t_{w_2}.$$
Hence there exists an interval $(j_1, j_2]$ of order $w_2$, $j_2 \le x$. If $j_2 = x$, then
$(0, x] = (0, j_1] \cup (j_1, j_2]$. Otherwise, if
$$j_2 = \sum_{i=0}^{w} t_i^{(2)} 2^i, \qquad t_w^{(2)} = t_w = 1, \ldots, t_{w_2}^{(2)} = t_{w_2} = 1,$$
then there exists a largest $w_3$, $w_3 < w_2$, having
$$t_{w_3}^{(2)} < t_{w_3}.$$
We proceed as before. Since $j_1 < j_2 < \ldots$, we finally arrive at $j_g = x$ and
$(0, x] = (0, j_1] \cup \ldots \cup (j_{g-1}, j_g]$. The orders of the intervals are
$w > w_2 > \ldots > w_g \ge 0$.
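
The construction in the proof of Lemma 1 can be followed mechanically. The sketch below is one possible reading of it, under the assumption that S is given as a sorted list of non-negative integers containing 0 and $x$; the function name and the sample sets are hypothetical. At each step the order of the next interval is the highest binary digit in which the current endpoint and $x$ differ.

```python
from bisect import bisect_left


def decompose(x, S):
    """Split (0, x] into intervals of pairwise different orders.

    S must be a sorted list of non-negative integers containing 0 and x.
    Returns (order, left, right) triples, each standing for the interval (left, right].
    """
    pieces = []
    j = 0
    while j < x:
        t = (j ^ x).bit_length() - 1        # highest binary digit where j and x differ
        u = j >> t                          # j lies in the block [u * 2^t, (u + 1) * 2^t)
        target = (u + 1) << t
        j_next = S[bisect_left(S, target)]  # smallest element of S that is >= target
        pieces.append((t, j, j_next))
        j = j_next
    return pieces


print(decompose(13, [0, 1, 4, 5, 9, 13]))
print(decompose(13, list(range(14))))
```

When S contains every integer up to $x$, the pieces are the usual blocks of the binary expansion of $x$.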


2. Sums involving a function $\varphi(k, q)$. Let $k$, $q$ be positive and write
$\varphi(k, q)$ for the number of integers $x$, $0 \le x < q$, so that g.c.d.$(x, q) \le k$.


LEMMA 2.
$$\sum_{q=1}^{v} \varphi(k, q)q^{-1} = v + O(vk^{-1} + \log v \log k).$$


Note. Here and throughout the paper, the inequality indicated by the
O-symbol holds for all values of all variables involved.


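Proof. Clearly,
$$\varphi(k, q) = \sum_{\substack{w \mid q \\ w \le k}} \varphi\!\left(\frac{q}{w}\right),$$
where $\varphi(x)$ is the Euler $\varphi$-function. Using the well-known relation
$$\varphi(x) = x \sum_{y \mid x} \mu(y)y^{-1},$$
we obtain
$$\sum_{q=1}^{v} \varphi(k, q)q^{-1} = \sum_{w=1}^{\min(k,v)} w^{-1} \sum_{y=1}^{[v/w]} \mu(y)y^{-1} \sum_{z=1}^{[v/(wy)]} 1,$$
where $[\alpha]$ is the integral part of $\alpha$. Thus
$$\sum_{q=1}^{v} \varphi(k, q)q^{-1} = v \sum_{w=1}^{\min(k,v)} w^{-2} \sum_{y=1}^{[v/w]} \mu(y)y^{-2} + O(\log v \log k)$$
$$= v \sum_{w=1}^{\min(k,v)} w^{-2}\zeta(2)^{-1} + O\!\left(\sum_{w=1}^{\min(k,v)} w^{-1}\right) + O(\log v \log k)$$
$$= v + O(vk^{-1} + \log v \log k).$$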
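
The divisor-sum formula for $\varphi(k, q)$ used at the start of this proof can be compared with a direct count. The sketch below is a brute-force check for small values, with hypothetical function names; it relies on the convention g.c.d.$(0, q) = q$, which is also the convention of Python's `math.gcd`.

```python
from math import gcd


def phi_k_bruteforce(k, q):
    """Number of integers x with 0 <= x < q and gcd(x, q) <= k."""
    return sum(1 for x in range(q) if gcd(x, q) <= k)


def euler_phi(n):
    """Euler's function, counted directly."""
    return sum(1 for x in range(1, n + 1) if gcd(x, n) == 1)


def phi_k_formula(k, q):
    """Sum of euler_phi(q / w) over the divisors w of q with w <= k."""
    return sum(euler_phi(q // w) for w in range(1, min(k, q) + 1) if q % w == 0)


def lemma2_sum(k, v):
    """Left side of Lemma 2: the sum of phi(k, q) / q for q = 1, ..., v."""
    return sum(phi_k_bruteforce(k, q) / q for q in range(1, v + 1))


for k in (1, 2, 5, 20):
    print(k, lemma2_sum(k, 200))
```

For $k = 1$ the sum is the classical average of $\varphi(q)/q$, and for $k \ge v$ every term equals 1, so the sum is exactly $v$.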


LEMMA 3.
$$\sum_{q=1}^{v} \psi(q)\varphi(k, q)q^{-1} = \Psi(v) + O(\Psi(v)k^{-1} + \Omega(v) \log k).$$

Proof. Put $\Pi(k, 0) = 0$ and
$$\Pi(k, r) = \sum_{q=1}^{r} \varphi(k, q)q^{-1}$$
for $r \ge 1$. Lemma 2 yields
$$\Pi(k, r) = r + O(rk^{-1} + \log r \log k). \tag{7}$$
Using partial summation we obtain
$$\sum_{q=1}^{v} \psi(q)\varphi(k, q)q^{-1} = \sum_{q=1}^{v} \psi(q)(\Pi(k, q) - \Pi(k, q - 1))$$
$$= \sum_{q=1}^{v-1} \Pi(k, q)(\psi(q) - \psi(q + 1)) + \Pi(k, v)\psi(v) \tag{8}$$
$$= \sum_{q=1}^{v-1} q(\psi(q) - \psi(q + 1)) + v\psi(v) + R(k, v)$$
$$= \Psi(v) + R(k, v),$$
where, according to (7),
$$R(k, v) = O\!\left(\sum_{q=1}^{v-1} (qk^{-1} + \log q \log k)(\psi(q) - \psi(q + 1))\right) + O(vk^{-1} + \log v \log k)\psi(v) \tag{9}$$
$$= O\!\left(\Psi(v)k^{-1} + \log k \sum_{q=2}^{v} \psi(q)(\log q - \log(q - 1)) + \log k\, \psi(1)\right).$$
Now
$$\sum_{q=2}^{v} \psi(q)(\log q - \log(q - 1)) = O\!\left(\sum_{q=2}^{v} \psi(q) \log\!\left(1 + \frac{1}{q - 1}\right)\right) = O(\Omega(v)). \tag{10}$$
Lemma 3 is a consequence of (8), (9), and (10).
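
The partial summation step (8) is an exact identity and can be verified with rational arithmetic. In the sketch below the functions `psi` and `big_pi` are arbitrary stand-ins chosen for the example (they are not the paper's $\psi$ and $\Pi$); the only property used is that the second function vanishes at 0.

```python
from fractions import Fraction


def partial_summation_sides(psi, big_pi, v):
    """Both sides of the partial summation identity used in (8)."""
    lhs = sum(psi(q) * (big_pi(q) - big_pi(q - 1)) for q in range(1, v + 1))
    rhs = sum(big_pi(q) * (psi(q) - psi(q + 1)) for q in range(1, v)) + big_pi(v) * psi(v)
    return lhs, rhs


def psi(q):
    """Stand-in decreasing function 1/q, as an exact fraction."""
    return Fraction(1, q)


def big_pi(r):
    """Stand-in summatory function with big_pi(0) = 0."""
    return Fraction(r * (r + 3), 7)


print(partial_summation_sides(psi, big_pi, 25))

# With big_pi(r) = r the right side reduces to Psi(v), as in the last line of (8).
print(partial_summation_sides(psi, lambda r: Fraction(r), 25))
```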


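3. Bounds for certain integrals. We introduce the following functions
and integrals.
$$\beta(q, \theta) = \begin{cases} 1, & \text{if } 0 \le \theta < \psi(q) \\ 0 & \text{otherwise,} \end{cases}$$
$$\gamma(q, \theta) = \sum_{p} \beta(q, q\theta - p),$$
$$\gamma(k, q, \theta) = \sum_{\substack{p \\ \mathrm{g.c.d.}(p,q) \le k}} \beta(q, q\theta - p),$$
$$I(q) = \int_0^1 \gamma(q, \theta)\,d\theta,$$
$$I(k, q) = \int_0^1 \gamma(k, q, \theta)\,d\theta,$$
$$I(k; q, r) = \int_0^1 \gamma(k, q, \theta)\gamma(k, r, \theta)\,d\theta,$$
$$\Psi(u, v) = \sum_{q=u+1}^{v} \psi(q).$$
We observe
$$N(v, \theta) = \sum_{q=1}^{v} \gamma(q, \theta)$$
and put
$$N(k; u, v; \theta) = \sum_{q=u+1}^{v} \gamma(k, q, \theta).$$

LEMMA 4.
$$I(q) = \psi(q); \qquad I(k, q) = \psi(q)\varphi(k, q)q^{-1} \tag{11}$$
$$I(k; q, r) \le \psi(q)\psi(r) + A(k; q, r)\psi(q)q^{-1}, \tag{12}$$
where $A(k; q, r)$ is the number of solutions $p$, $s$ of
$$qs - rp = 0, \qquad 0 \le p < q \tag{13}$$
having
$$\mathrm{g.c.d.}(p, q) \le k, \qquad \mathrm{g.c.d.}(s, r) \le k.$$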

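Proof. $I(q) = \psi(q)$ is rather trivial, while the second half of (11) follows
from
$$I(k, q) = \sum_{\mathrm{g.c.d.}(p,q) \le k} \int_0^1 \beta(q, \theta q - p)\,d\theta = \varphi(k, q)q^{-1} \int_{-\infty}^{\infty} \beta(q, \theta)\,d\theta.$$
As for $I(k; q, r)$, we have
$$I(k; q, r) = \sum_{\substack{p, s \\ \mathrm{g.c.d.}(p,q) \le k \\ \mathrm{g.c.d.}(s,r) \le k}} \int_0^1 \beta(q, \theta q - p)\beta(r, \theta r - s)\,d\theta.$$
We split this sum into two parts,
$$I(k; q, r) = I_0(k; q, r) + I_1(k; q, r),$$
where $I_0$ consists of the terms with $qs - rp \ne 0$.
$$I_0(k; q, r) \le \sum_{\substack{p, s \\ qs - rp \ne 0}} \int_0^1 \beta(q, \theta q - p)\beta(r, \theta r - s)\,d\theta
= \sum_{\substack{p, s \\ qs - rp \ne 0}} \int_{-(p/q)}^{1-(p/q)} \beta(q, q\theta')\,\beta\!\left(r, r\theta' - \frac{qs - rp}{q}\right) d\theta'. \tag{14}$$
To find an estimate for this sum, write $q = q'd$, $r = r'd$, $qs - rp = hd$, where
$d = \mathrm{g.c.d.}(q, r)$. For given $h$, $p$ is determined modulo $q'$. Hence
$$I_0(k; q, r) \le d \sum_{h \ne 0} \int_{-\infty}^{\infty} \beta(q, q\theta')\beta(r, r\theta' - hdq^{-1})\,d\theta'$$
$$\le d \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \beta(q, q\theta')\beta(r, r\theta' - \lambda dq^{-1})\,d\theta'\,d\lambda$$
$$= \psi(q)\psi(r).$$
In changing from the summation over $h$ to the continuous parameter $\lambda$ we
used the fact that the function
$$\int_{-\infty}^{\infty} \beta(q, q\theta')\beta(r, r\theta' - \lambda dq^{-1})\,d\theta'$$
is monotonically decreasing in $\lambda$ when $\lambda \ge 0$, and monotonically increasing
when $\lambda \le 0$.

To prove Lemma 4 it remains to give an upper bound for $I_1(k; q, r)$. In
analogy to (14), we find
$$I_1(k; q, r) = \sum_{\substack{qs - rp = 0 \\ \mathrm{g.c.d.}(p,q) \le k \\ \mathrm{g.c.d.}(s,r) \le k}} \int_{-(p/q)}^{1-(p/q)} \beta(q, q\theta')\beta(r, r\theta')\,d\theta'
\le A(k; q, r)\psi(q)q^{-1}.$$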


LEMMA 5.
$$\int_0^1 N(v, \theta)\,d\theta = \Psi(v)$$
$$\int_0^1 N(k; u, v; \theta)\,d\theta = \sum_{q=u+1}^{v} \psi(q)\varphi(k, q)q^{-1}$$
$$\int_0^1 N^2(k; u, v; \theta)\,d\theta \le \Psi^2(u, v) + 2\sum_{q=u+1}^{v} \psi(q)d_k(q),$$
where $d_k(q)$ is the number of divisors of $q$ not exceeding $k$.


Proof. The first two assertions follow from (11). As an immediate
consequence of (12) we have
$$\int_0^1 N^2(k; u, v; \theta)\,d\theta \le \Psi^2(u, v) + 2\sum_{u < r \le q \le v} A(k; q, r)\psi(q)q^{-1}.$$
Now
$$\sum_{r=1}^{q} A(k; q, r)$$
is equal to the number of solutions $r$, $p$, $s$ of
$$qs - rp = 0, \qquad 0 \le p < q, \qquad 1 \le r \le q,$$
$$\mathrm{g.c.d.}(p, q) \le k, \qquad \mathrm{g.c.d.}(s, r) \le k.$$
Define $a$, $b$ by
$$\frac{p}{q} = \frac{a}{b}, \qquad \mathrm{g.c.d.}(a, b) = 1.$$
Then $b \mid q$ and g.c.d.$(p, q) \le k$ implies $qb^{-1} \le k$. Thus the number of possible
choices for $b$ is $d_k(q)$. Furthermore, there are $\varphi(b) \le b$ possibilities for $a$ and
$qb^{-1}$ possibilities for $r$, once $b$ is given. Hence
$$\sum_{r=1}^{q} A(k; q, r) \le q\,d_k(q)$$
and
$$\sum_{u < r \le q \le v} A(k; q, r)\psi(q)q^{-1} \le \sum_{q=u+1}^{v} \psi(q)d_k(q).$$




4. Proof of Theorem 1 ($n = 1$). Write $w(h) = [\Psi(h)\Omega(h)]$ and define
S, S', S'' as in § 1. Let $L_s$ be the set of all pairs $(u, v)$, $u \in S'$, $v \in S'$, so that
$(w(u), w(v)]$ is an interval of any order $t$ with respect to $w$ (see § 1), and
$w(v) \le 2^s$. From now on, the numbers $k$, $s$ are always connected by the
relation
$$k = 2^s. \tag{15}$$
From here on, we make heavy use of the methods developed in (2). Write
$h^* = h^*(s)$ for the largest integer $h^*$ having $w(h^*) \le 2^s$.

LEMMA 6.
$$0 \le \int_0^1 (N(h^*, \theta) - N(k; 0, h^*; \theta))\,d\theta = O(s2^{s/2}) \tag{16}$$
$$\sum_{(u,v) \in L_s} \int_0^1 (N(k; u, v; \theta) - \Psi(u, v))^2\,d\theta = O(s^2 2^s). \tag{17}$$


Proof. The first two equations of Lemma 5 give
$$\int_0^1 (N(h^*, \theta) - N(k; 0, h^*; \theta))\,d\theta = \Psi(h^*) - \sum_{q=1}^{h^*} \psi(q)\varphi(k, q)q^{-1} = O(\Psi(h^*)k^{-1} + \Omega(h^*) \log k)$$
according to Lemma 3. Since
$$\Omega(h^*) = O(2^{s/2}),$$
(16) follows.
Using Lemma 5 again we see that a single integral in (17) does not exceed
$$2\sum_{q=u+1}^{v} \psi(q)d_k(q) + 2\Psi(u, v)\left(\Psi(u, v) - \sum_{q=u+1}^{v} \psi(q)\varphi(k, q)q^{-1}\right).$$
We first take the sum over those pairs $(u, v) \in L_s$ where $(w(u), w(v)]$ is an
interval of fixed order $t$. Since intervals of order $t$ cover the positive axis
exactly once, we obtain the upper bound
$$2\sum_{q=1}^{h^*} \psi(q)d_k(q) + 2\Psi(h^*)\left(\Psi(h^*) - \sum_{q=1}^{h^*} \psi(q)\varphi(k, q)q^{-1}\right).$$
We observe
$$\sum_{q=1}^{h^*} \psi(q)d_k(q) \le 2^s \sum_{t=1}^{k} t^{-1} = O(2^s \log k)$$
and using Lemma 3 we find the upper bound
$$O(2^s \log k) + O(\Psi^2(h^*)k^{-1} + \Psi(h^*)\Omega(h^*) \log k) = O(s2^s).$$
Summing over $t$ and observing $t \le s$ we obtain (17).
















LEMMA 7. There is a sequence of subsets $\sigma_1, \sigma_2, \ldots$ of the unit-interval with
measures
$$\mu_s = \int_{\sigma_s} d\theta = O(s^{-1-\epsilon})$$
such that
$$N(h, \theta) = \Psi(h) + O(2^{s/2}s^{2+\epsilon})$$
for any $h$ with $w(h) \le 2^s$, $h \in S'$, and any $\theta$ in $0 \le \theta < 1$, but not in $\sigma_s$.


Proof. We define $\sigma_s$ to be the set of all $\theta$ in $0 \le \theta < 1$, for which not both
of the following two inequalities hold:
$$0 \le N(h^*, \theta) - N(k; 0, h^*; \theta) \le s^{2+\epsilon}2^{s/2} \tag{18}$$
$$\sum_{(u,v) \in L_s} (N(k; u, v; \theta) - \Psi(u, v))^2 \le s^{3+\epsilon}2^s. \tag{19}$$
As a consequence of Lemma 6,
$$\mu_s = O(s^{-1-\epsilon}).$$
If $h \le h^*$, $h \in S'$, then the interval $(0, w(h)]$ is the union of at most $s$ intervals
$(w(u), w(v)]$, where $(u, v) \in L_s$. Hence
$$N(k; 0, h; \theta) - \Psi(h) = \sum (N(k; u, v; \theta) - \Psi(u, v)),$$
where the sum is over at most $s$ pairs $(u, v) \in L_s$. This fact, together with
(19) and Cauchy's inequality yields for $0 \le \theta < 1$, $\theta \notin \sigma_s$,
$$(N(k; 0, h; \theta) - \Psi(h))^2 \le s^{4+\epsilon}2^s.$$
The last equation together with (18) gives Lemma 7.


Proof of Theorem 1 ($n = 1$). Since $\sum s^{-1-\epsilon}$ is convergent, there exists for
almost all $\theta$, $0 \le \theta < 1$, an $s_0 = s_0(\theta)$ such that $\theta \notin \sigma_s$ for $s \ge s_0$. Assume $\theta$
has such an $s_0(\theta)$ and assume $h$ to be so large that $w(h) \ge 2^{s_0}$. Choose $s$ so
that $2^{s-1} \le w(h) < 2^s$.

Suppose $h \in S'$. Then we have with Lemma 7
$$N(h, \theta) = \Psi(h) + O(2^{s/2}s^{2+\epsilon}) = \Psi(h) + O(\Psi^{1/2}(h)\Omega^{1/2}(h) \log^{2+\epsilon} \Psi(h)).$$
Hence Theorem 1 holds for $h \in S'$. By the same argument we can prove
the Theorem for $h \in S''$.
To any $h$ there exist $h'$, $h''$ with $h' \in S'$, $h'' \in S''$ and
$$w(h') = w(h) = w(h'').$$
Then
$$|\Psi(h)\Omega(h) - \Psi(h')\Omega(h')| \le 1,$$
$$|\Psi(h) - \Psi(h')| \le \Omega^{-1}(h) \le \Omega^{-1}(1) = \psi(1)^{-1},$$
and similarly for $\Psi(h'')$. Since
$$N(h', \theta) \le N(h, \theta) \le N(h'', \theta),$$
the case $n = 1$ of Theorem 1 follows.
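5. The case $n \ge 2$. Using
$$v - \sum_{q=1}^{v} \varphi^n(k, q)q^{-n} = \sum_{q=1}^{v} (q^n - \varphi^n(k, q))q^{-n}
\le n \sum_{q=1}^{v} (q - \varphi(k, q))q^{n-1}q^{-n}
= n\left(v - \sum_{q=1}^{v} \varphi(k, q)q^{-1}\right)$$
we easily generalize Lemmas 2, 3 to
$$\sum_{q=1}^{v} \varphi^n(k, q)q^{-n} = v + O(vk^{-1} + \log k \log v),$$
$$\sum_{q=1}^{v} \psi(q)\varphi^n(k, q)q^{-n} = \Psi(v) + O(\Psi(v)k^{-1} + \Omega(v) \log k).$$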


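In analogy to $\beta(q, \theta)$ of § 3 we define $\beta(q, \theta_1, \ldots, \theta_n)$ to be the characteristic
function of the rectangle
$$0 \le \theta_i < \psi_i(q) \qquad (i = 1, \ldots, n)$$
and put
$$\gamma(q, \theta_1, \ldots, \theta_n) = \sum_{p_1, \ldots, p_n} \beta(q, q\theta_1 - p_1, \ldots, q\theta_n - p_n),$$
$$\gamma(k; q, \theta_1, \ldots, \theta_n) = \sum_{\substack{p_1, \ldots, p_n \\ \mathrm{g.c.d.}(p_i, q) \le k}} \beta(q, q\theta_1 - p_1, \ldots, q\theta_n - p_n).$$
$I(q)$, $I(k, q)$, $I(k; q, r)$ are now n-dimensional integrals. To find an upper
bound for
$$I(k; q, r) = \sum_{\substack{p_1, \ldots, p_n;\ \mathrm{g.c.d.}(p_i, q) \le k \\ s_1, \ldots, s_n;\ \mathrm{g.c.d.}(s_i, r) \le k \\ i = 1, \ldots, n}} \int_0^1 \cdots \int_0^1 \beta(q, q\theta_1 - p_1, \ldots, q\theta_n - p_n)\,\beta(r, r\theta_1 - s_1, \ldots, r\theta_n - s_n)\,d\theta_1 \cdots d\theta_n$$
we split this sum into $n + 1$ parts,
$$I(k; q, r) = I_0 + \ldots + I_n,$$
where $I_j$ consists of the terms with exactly $j$ indices $i_1, \ldots, i_j$ having
$qs_i - rp_i = 0$. We find
$$I_0(k; q, r) \le \psi(q)\psi(r)$$
and
$$I_j(k; q, r) \le cA(k; q, r)\psi(q)q^{-j} \le cA(k; q, r)\psi(q)q^{-1}.$$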


There are no other modifications of any depth. 


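6. On the proof of Theorem 2. For simplicity assume $m = 1$. We put

\[ \beta(Q, \theta) = \begin{cases} 1, & \text{if } 0 \le \theta < \psi(Q) \\ 0 & \text{otherwise} \end{cases} \]

and define $\gamma(Q, \theta)$, $I(Q)$ in an obvious way. Further

\[ I(Q, R) = \int_0^1 \gamma(Q, \theta)\, \gamma(R, \theta)\, d\theta, \]

\[ \Psi(u, v) = \sum_{u < Q \le v} \psi(Q). \]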


We observe 


and put 


\[ N(u, v, \theta) = \sum_{u < Q \le v} \gamma(Q, \theta). \]


We do not need the parameter k now, which was essential in Theorem 1. 
Lemma 4 now reads 


LEMMA 4a.

\[ I(Q) = \psi(Q), \tag{20} \]

\[ I(Q, R) = \psi(Q)\psi(R), \tag{21} \]

if $Q$, $R$ are linearly independent (there exists no $\rho$ having $Q = \rho R$).

\[ I(Q, R) \le \psi(Q)\psi(R) + c\, A(q_1, r_1)\, \psi(Q)\, q_1^{-1}, \tag{22} \]

if $Q$, $R$ are linearly dependent. Here $q_1$, $r_1$ are the first co-ordinates of $Q$, $R$ and
$A(q_1, r_1)$ is the number of solutions $p$, $s$ of

\[ q_1 s - r_1 p = 0, \qquad 0 \le p < q_1. \]

(20) and (21) are proved like (11), while the proof of (22) is like the one given
for (12). Lemma 5 becomes


LEMMA 5a.

\[ \int_0^1 N(u, v, \theta)\, d\theta = \sum_{u < Q \le v} \psi(Q) = \Psi(u, v), \]

\[ \int_0^1 N^2(u, v, \theta)\, d\theta \le \Psi^2(u, v) + c \sum_{u < Q \le v} \psi(Q)\, d(Q). \]




All the other changes in the proof are obvious, except perhaps the definition of 
w(h), namely w(h) = [x(h)]. 


REFERENCES
1. J. W. S. Cassels, Some metrical theorems in diophantine approximation I, Proc. Camb. Phil. Soc., 46 (1950), 209-218.
2. ----- Some metrical theorems in diophantine approximation III, Proc. Camb. Phil. Soc., 46 (1950), 219-225.
3. ----- An introduction to diophantine approximation, Cambridge Tracts, 45 (1957).
4. A. Khintchine, Zur metrischen Theorie der diophantischen Approximationen, Math. Z., (1926), 706-714.
5. W. Schmidt, A metrical theorem in geometry of numbers, Trans. Amer. Math. Soc., 00 (1960), 000-000.


Montana State University 











A LOCAL PROPERTY OF MEASURABLE SETS 
W. EAMES 


1. Introduction.¹ Let $\Omega$ be a metric space with metric $\rho$, let $C$ be a class
of closed sets from $\Omega$ and let $\tau$ be a non-negative real-valued set function on $C$.
We assume that the empty set $\emptyset$ is in $C$ and that $\tau(I) = 0$ if and only if
$I = \emptyset$. For each set $A$ in $\Omega$, we define $\varphi(A)$, $0 \le \varphi(A) \le \infty$, by:

\[ \varphi(A) = \lim_{\epsilon \to 0} \inf \sum_{n} \tau(I(n)) \]

where the infimum is taken for all possible countable collections of sets $I(n)$
from $C$ such that:

\[ A \subset \bigcup_{n} I(n) \]

and the diameter of $I(n)$, $d(I(n))$, is less than $\epsilon$ for every $n$. We assume that
such a countable collection exists for every set $A$ and every $\epsilon > 0$. $\varphi$ is an
outer measure function (3, p. 85), that is:
(i) $\varphi(\emptyset) = 0$,
(ii) If $A \subset B$, then $\varphi(A) \le \varphi(B)$,
(iii) For any sequence of sets $A(n)$, $n = 1, 2, 3, \ldots$,

\[ \varphi\Bigl(\bigcup_{n=1}^{\infty} A(n)\Bigr) \le \sum_{n=1}^{\infty} \varphi(A(n)). \]
A set $A$ is said to be measurable if

\[ \varphi(B) = \varphi(A \cap B) + \varphi((\Omega - A) \cap B) \]

for every set $B$ in $\Omega$. All Borel sets are measurable (3, pp. 102-106). For every
set $A$ in $\Omega$, there is a measurable set $B$, called a measurable cover for $A$, such
that $A \subset B$ and $\varphi(B) = \varphi(A)$ (3, pp. 107-108). That is, $\varphi$ is a regular metric
outer measure function.

We also define for a set $A$ and a point $p$ in $\Omega$, the number $D(A,p)$, $0 \le D(A,p) \le \infty$, by:

\[ D(A, p) = \lim_{\epsilon \to 0} \sup \frac{\varphi(A \cap I)}{\tau(I)} \]

where the supremum is taken as $I$ ranges over all sets in $C$ such that $p \in I$
and $d(I) < \epsilon$.


¹ Received April 23, 1959. The author wishes to express his appreciation to Professor H. W.
Ellis for suggesting this investigation. This work was done while the author held a National
Research Council of Canada fellowship.




We show that, if the sets in $C$ satisfy a certain regularity condition and
$\varphi(A)$ is finite, then:
(i) $D(A,p) = 1$ for almost all $p \in A$,
(ii) $D(A,p) = 0$ for almost all $p \in \Omega - A$ if and only if $A$ is measurable,
(iii) By considering the behaviour of $D(A,p)$ it is possible to construct a
measurable cover for $A$ in a unique manner.


These results are similar to those previously obtained for the case of Haus- 
dorff outer measure in Euclidean space (2, 5). 


2. Separated sets. Let $A$ and $B$ be two sets in $\Omega$. $A$ is said to be separated
from $B$ if, for every $\epsilon > 0$, there is an open set $O$ such that $\varphi(A \cap (\Omega - O)) < \epsilon$ and
$\varphi(B \cap O) < \epsilon$ (2). Equivalently, $A$ is separated from $B$ if, for every $\epsilon > 0$,
there is a closed set $F$ such that $\varphi((\Omega - F) \cap B) < \epsilon$ and $\varphi(A \cap F) < \epsilon$. If $A$ is
separated from $B$ then $B$ is not necessarily separated from $A$. For example,
let $\varphi$ be linear outer measure in the plane, let $A$ be the interior of a circle and
let $B$ be the circumference of this circle. However, we have:

THEOREM 1. If $\varphi(A)$ is finite, and $A$ is separated from $B$, then $B$ is separated
from $A$.

Proof. Let $O$ be an open set such that $\varphi(A \cap (\Omega - O)) < \tfrac12\epsilon$ and $\varphi(B \cap O) < \tfrac12\epsilon$.
Because $\Omega$ is a metric space, $O$ is the union of a countable number of closed
sets, $C(n)$, $n = 1, 2, \ldots$. From the measurability of $O$,

\[ \varphi(A) - \varphi(A \cap O) < \tfrac12\epsilon. \]

Choose an integer $m$ such that

\[ \varphi(A \cap O) < \varphi(A \cap F) + \tfrac12\epsilon \]

where $F$ is the closed set

\[ F = \bigcup_{n=1}^{m} C(n). \]

Then $\varphi(A \cap (\Omega - F)) < \epsilon$ and $\varphi(B \cap F) < \epsilon$ so that $B$ is separated from $A$.
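THEOREM 2. If $A$ is separated from $B$, then $\varphi(A \cup B) = \varphi(A) + \varphi(B)$.

Proof. We can assume that $\varphi(A)$ and $\varphi(B)$ are finite. Since $A$ is separated
from $B$ and $B$ is therefore separated from $A$, there are open sets $O$ and $G$ such
that $\varphi(O \cap B)$, $\varphi((\Omega - O) \cap A)$, $\varphi(G \cap A)$, and $\varphi((\Omega - G) \cap B)$ are all less than a
preassigned $\epsilon > 0$. From the measurability of $O$ and $G$, $\varphi(A \cap O) > \varphi(A) - \epsilon$
and $\varphi(B \cap G) > \varphi(B) - \epsilon$. From the subadditivity of $\varphi$,
$\varphi(O \cap G \cap (A \cup B)) < 2\epsilon$. Thus,

\[ \varphi(A \cup B) \ge \varphi((A \cup B) \cap (O \cup G)) \ge \varphi(A) + \varphi(B) - 4\epsilon \]

which, with the subadditivity of $\varphi$, proves the theorem.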
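The last displayed inequality in the proof of Theorem 2 compresses an inclusion-exclusion step. A possible way to fill it in is sketched below; the abbreviation $E = A \cup B$ and the notation $\Omega - O$ for a complement are introduced here for illustration, and the sketch assumes the reading of the proof in which $\varphi(A \cap O) > \varphi(A) - \epsilon$, $\varphi(B \cap G) > \varphi(B) - \epsilon$ and $\varphi(O \cap G \cap E) < 2\epsilon$.

```latex
% Assumptions: O is measurable, \varphi(A) and \varphi(B) are finite, E = A \cup B.
% Test the measurable set O against E \cap (O \cup G) and against E \cap G:
\begin{align*}
  \varphi\bigl(E \cap (O \cup G)\bigr)
     &= \varphi(E \cap O) + \varphi\bigl(E \cap G \cap (\Omega - O)\bigr), \\
  \varphi(E \cap G)
     &= \varphi(E \cap G \cap O) + \varphi\bigl(E \cap G \cap (\Omega - O)\bigr).
\end{align*}
% All terms are finite, so the second line may be subtracted from the first:
\begin{align*}
  \varphi\bigl(E \cap (O \cup G)\bigr)
     &= \varphi(E \cap O) + \varphi(E \cap G) - \varphi(E \cap O \cap G) \\
     &\ge \varphi(A \cap O) + \varphi(B \cap G) - 2\epsilon \\
     &> \varphi(A) + \varphi(B) - 4\epsilon .
\end{align*}
```

Only the measurability of $O$ is used in this version of the step; monotonicity gives $\varphi(E \cap O) \ge \varphi(A \cap O)$ and $\varphi(E \cap G) \ge \varphi(B \cap G)$.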













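THEOREM 3. If $\varphi(A)$ is finite, then $A$ is measurable if and only if $\Omega - A$ is
separated from $A$.

Proof. Assume first that $A$ is measurable. From the definition of $\varphi$ we can,
for a given integer $n$, obtain a sequence of closed sets $I(n,i)$, $i = 1, 2, \ldots$,
from $C$ with diameter $< 1/n$, whose union covers $A$ and such that:

\[ \sum_{i=1}^{\infty} \tau(I(n,i)) - \frac{1}{n} < \varphi(A) < \sum_{i=1}^{\infty} \tau(I(n,i)) + \frac{1}{n}. \]

From the regularity of $\varphi$,

\[ \varphi(A) = \lim_{m \to \infty} \varphi\Bigl(A \cap \bigcup_{i=1}^{m} I(n,i)\Bigr) \]

so there is an integer $m(n)$ such that

\[ \varphi(A) - \varphi\Bigl(A \cap \bigcup_{i=1}^{m(n)} I(n,i)\Bigr) < \frac{\epsilon}{2^n}. \]

Let

\[ F = \bigcap_{n=1}^{\infty} \bigcup_{i=1}^{m(n)} I(n,i). \]

$F$ is closed, $\varphi(A \cap (\Omega - F)) < \epsilon$, and, using the measurability of $A$ and $F$,

\[ \varphi(F \cap (\Omega - A)) - \varphi((\Omega - F) \cap A) = \varphi(F) - \varphi(A) \le 0, \]

so that $\varphi(F \cap (\Omega - A)) \le \varphi((\Omega - F) \cap A) < \epsilon$, and $\Omega - A$ is separated from $A$.

If $\Omega - A$ is separated from $A$ then $B \cap (\Omega - A)$ is separated from $B \cap A$ for any set
$B$ so, from Theorem 2, $\varphi(B) = \varphi(B \cap A) + \varphi(B \cap (\Omega - A))$ and $A$ is thus
measurable.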

THEOREM 4. If $\varphi(A)$ and $\varphi(B)$ are finite and $\varphi(A \cup B) = \varphi(A) + \varphi(B)$,
then $A$ is separated from $B$.

Proof. Assume first that $B$ is measurable and assign an $\epsilon > 0$. From the
measurability of $B$ and the hypothesis, $\varphi(A \cap B) = 0$. By Theorem 3 there
is a closed set $F$ such that $\varphi(B \cap (\Omega - F)) < \epsilon$ and $\varphi((\Omega - B) \cap F) < \epsilon$. Because $F \subset
B \cup (F \cap (\Omega - B))$, we have $\varphi(A \cap F) < \epsilon$ and $A$ is separated from $B$. If $B$ is not
measurable let $C$ be a measurable cover for $B$. By the preceding, $A$ is separated
from $C$ and thus from $B$.

3. The function $D(A, p)$. This function has been defined in the introduction.
Several functions of this type, differing only in the class of sets $I$ over
which the supremum is taken, have been studied for the case of Hausdorff
outer measure in Euclidean space by various authors (1; 2; 4; 5) and examples
due to Besicovitch (1) and Nikodym (4) show that a slight change in this
class radically affects the function. If $\varphi(A)$ is finite, then our choice, namely
"all $I$ in $C$ such that $p \in I$ and $d(I) < \epsilon$," results in a function which is, for
almost all $p$, the characteristic function of $A$ if and only if $A$ is measurable
and thus it affords a local characterization of the measurability of $A$.

THEOREM 5. If $\varphi(A)$ is finite, then, for every $\epsilon > 0$ there is a $\delta > 0$ such
that:

\[ \varphi\Bigl(A \cap \bigcup_{n=1}^{\infty} I(n)\Bigr) < \sum_{n=1}^{\infty} \tau(I(n)) + \epsilon \]

for any sequence of sets $I(n)$, $n = 1, 2, \ldots$, from $C$ with diameters less than $\delta$.

Proof. This has been proved (1, p. 427) for the case in which $\varphi$ is linear
outer measure in the plane and $A$ is measurable. The proof here is the same,
using the measurability of the sets in $C$ instead of the measurability of $A$.


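THEOREM 6. If $\varphi(A)$ is finite, then $D(A, p) \ge 1$ for almost all $p \in A$.

Proof. Let $a > 0$ and $0 < b < 1$. Let $T(a, b)$ be the set:

\[ T(a,b) = \{p : p \in A \text{ and if } I \in C \text{ and } p \in I \text{ and } d(I) < a, \text{ then } \varphi(A \cap I) < b \cdot \tau(I)\}. \]

It is sufficient to prove that $T(a, b)$ is null for all such $a$ and $b$. Assign an
$\epsilon > 0$ and obtain a $\delta$ by Theorem 5. Choose $\delta < a$. Let $\{I(n)\}$ be a sequence
of sets from $C$ with diameter $< \delta$, whose union covers $A$, and such that:

\[ \varphi(A) - \epsilon < \sum_{n=1}^{\infty} \tau(I(n)) < \varphi(A) + \epsilon. \]

Let $\mathfrak{A}$ be the class of all sets from this sequence such that $\varphi(A \cap I(n)) \ge
b \cdot \tau(I(n))$ and let $\mathfrak{B}$ be the class of all the other sets from the sequence.
Then:

\[ \varphi(A) \le \varphi\Bigl(A \cap \bigcup_{\mathfrak{A}} I(n)\Bigr) + \varphi\Bigl(A \cap \bigcup_{\mathfrak{B}} I(n)\Bigr) \le \sum_{\mathfrak{A}} \tau(I(n)) + \epsilon + b \cdot \sum_{\mathfrak{B}} \tau(I(n)) \le \varphi(A) + 2\epsilon + (b - 1) \cdot \sum_{\mathfrak{B}} \tau(I(n)). \]

Thus

\[ \sum_{\mathfrak{B}} \tau(I(n)) \le 2\epsilon/(1 - b) \]

and also,

\[ \varphi(T(a,b)) \le \varphi\Bigl(A \cap \bigcup_{\mathfrak{B}} I(n)\Bigr) \le \sum_{\mathfrak{B}} \tau(I(n)) + \epsilon \]

so that $T(a, b)$ is null, which proves the theorem.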
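The passage from the three-term estimate to the bound on the sum over the second class in the proof of Theorem 6 is a one-line rearrangement. It is spelled out below as a sketch; the abbreviations $S_{\mathfrak{A}}$ and $S_{\mathfrak{B}}$ for the two partial sums of $\tau(I(n))$ are introduced here only for illustration.

```latex
% Assumptions: 0 < b < 1, \varphi(A) finite, and
%   S_A + S_B = \sum_n \tau(I(n)) < \varphi(A) + \epsilon .
\begin{align*}
  \varphi(A) &\le S_{\mathfrak{A}} + \epsilon + b\, S_{\mathfrak{B}}
              = (S_{\mathfrak{A}} + S_{\mathfrak{B}}) + \epsilon + (b-1)\, S_{\mathfrak{B}} \\
             &< \varphi(A) + 2\epsilon + (b-1)\, S_{\mathfrak{B}} .
\end{align*}
% Cancel \varphi(A) (finite) and divide by 1-b > 0:
\[
  (1-b)\, S_{\mathfrak{B}} < 2\epsilon
  \quad\Longrightarrow\quad
  S_{\mathfrak{B}} < \frac{2\epsilon}{1-b},
  \qquad
  \varphi\bigl(T(a,b)\bigr) \le S_{\mathfrak{B}} + \epsilon < \epsilon\Bigl(1 + \frac{2}{1-b}\Bigr).
\]
```

Since $\epsilon$ is arbitrary and $b$ is fixed, the right-hand side can be made as small as desired, which is what "null" requires.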


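THEOREM 7. If $\varphi(B)$ is finite and $D(A, p) = 0$ for almost all $p \in B$, then
$A$ is separated from $B$.

Proof. Assign an $\epsilon > 0$ and consider the set

\[ B(n) = \{p : p \in B \text{ and if } I \in C \text{ and } p \in I \text{ and } d(I) < 1/n, \text{ then } \varphi(A \cap I) < \epsilon \cdot \tau(I)\}. \]

By hypothesis,

\[ B - N \subset \bigcup_{n=1}^{\infty} B(n) \subset B \]

where $N$ is a null subset of $B$, and thus (3, p. 95),

\[ \varphi(B) = \lim_{m \to \infty} \varphi(B(m)). \]

Choose an integer $m$ so that $\varphi(B) < \varphi(B(m)) + \epsilon$ and let $\delta$ be determined
by applying Theorem 5 to $B$. Let $\{I(n)\}$ be a sequence of sets from $C$ with
diameter less than the smaller of $1/m$ and $\delta$, whose union covers $B(m)$, and
such that:

\[ \sum_{n=1}^{\infty} \tau(I(n)) - \epsilon < \varphi(B(m)) < \sum_{n=1}^{\infty} \tau(I(n)) + \epsilon. \]

Choose an integer $r$ such that

\[ \sum_{n=r+1}^{\infty} \tau(I(n)) < \epsilon. \]

Then, by Theorem 5,

\[ \varphi\Bigl(B \cap \bigcup_{n=r+1}^{\infty} I(n)\Bigr) < 2\epsilon. \]

By the definition of $B(m)$,

\[ \varphi\Bigl(A \cap \bigcup_{n=1}^{r} I(n)\Bigr) \le \sum_{n=1}^{r} \varphi(A \cap I(n)) \le \epsilon \sum_{n=1}^{r} \tau(I(n)) < \epsilon(\varphi(B(m)) + \epsilon). \]

Let

\[ C = \bigcup_{n=1}^{r} I(n). \]

$C$ is closed and $\varphi(B \cap (\Omega - C)) < 3\epsilon$ and $\varphi(A \cap C) < \epsilon(\varphi(B) + \epsilon)$ so that $A$ is
separated from $B$.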


4. The regularity conditions. For the remainder of the paper we
assume that to every set $I$ from $C$ there corresponds a set $I'$ from $C$ such that:
(i) $I' \supset \{p : \rho(p, I) < \alpha \cdot d(I)\}$ where $\alpha$ is a finite number greater than 1
and independent of $I$ and $\rho(p, I)$ is the distance from $p$ to $I$.
(ii) $\tau(I') \le \beta \cdot \tau(I)$ where $\beta$ is a finite number independent of $I$.
(iii) for every $\epsilon > 0$ there is a $\delta > 0$ such that $d(I) < \delta$ implies $d(I') < \epsilon$.













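THEOREM 8. If $\varphi(A)$ is finite and $B$ is separated from $A$, then $D(A, p) = 0$
for almost all $p \in B$.

Proof. Assign $\epsilon > 0$, $\eta > 0$, and $\delta > 0$. Let $O$ be an open set such that
$\varphi(A \cap O) < \epsilon\eta$ and $\varphi(B \cap (\Omega - O)) < \epsilon\eta$. Let $A(\epsilon) = \{p : p \in B$ and $D(A, p)
> \epsilon\}$. For every $p \in A(\epsilon) \cap O$ there is a set $I \in C$ such that $p \in I$, $d(I) < \delta$,
$I \subset O$ and $\varphi(A \cap I) > \epsilon \cdot \tau(I)$. Also, $d(I) \ne 0$, since $d(I) = 0$ implies that
$I$ is a set consisting of exactly one point, $I \subset A \cap B$, and $\varphi(I) > 0$, which is
impossible if $B$ is separated from $A$. Let the class of all such sets be denoted
by $\mathfrak{J}$.

Choose a maximal class $\mathfrak{I}(1)$ of disjoint sets from $\mathfrak{J}$ with the property that
$\delta/\alpha \le d(I) < \delta$ where $\alpha$ is the number referred to in (i) of the regularity
conditions. Continue this process in the following way: let $\mathfrak{I}(1), \mathfrak{I}(2), \mathfrak{I}(3), \ldots$,
be a sequence of maximal classes of disjoint sets from $\mathfrak{J}$ such that if $I \in \mathfrak{I}(n)$
then $I$ is disjoint from all the sets in any preceding class and $\delta/\alpha^n \le d(I) <
\delta/\alpha^{n-1}$. Each $\mathfrak{I}(n)$ contains at most a countable number of sets since $\tau(I) > 0$
for every $I \in \mathfrak{J}$ and

\[ \sum \tau(I) \le \frac{1}{\epsilon} \sum \varphi(A \cap I) = \frac{1}{\epsilon}\, \varphi\Bigl(A \cap \bigcup I\Bigr) < \infty \]

for any countable selection of sets from $\mathfrak{I}(n)$.

Thus the union of all the classes $\mathfrak{I}(n)$ is a class containing at most a countable
number of sets. With every set $I(n)$, $n = 1, 2, 3, \ldots$, in this union associate
a set $I'(n)$ by the regularity condition. Then,

\[ A(\epsilon) \cap O \subset \bigcup_{n=1}^{\infty} I'(n) \]

for, suppose $p \in A(\epsilon) \cap O$. There is a set $J \in \mathfrak{J}$ such that $p \in J$ and

\[ \delta/\alpha^{n} \le d(J) < \delta/\alpha^{n-1} \]

for some integer $n$. If

\[ p \notin \bigcup_{n=1}^{\infty} I'(n) \]

then $J \notin \mathfrak{I}(n)$ so there is a set $I \in \mathfrak{I}(m)$ for an $m \le n$ such that $I \cap J$ is not
empty. Then:

\[ \rho(p, I) \le d(J) < \delta/\alpha^{n-1} \le \delta/\alpha^{m-1} \le \alpha \cdot d(I) \]

so that $p \in I'$ which contradicts the assertion. Thus:

\[ \sum_{n=1}^{\infty} \tau(I'(n)) \le \beta \sum_{n=1}^{\infty} \tau(I(n)) \le \frac{\beta}{\epsilon} \sum_{n=1}^{\infty} \varphi(A \cap I(n)) = \frac{\beta}{\epsilon}\, \varphi\Bigl(A \cap \bigcup_{n=1}^{\infty} I(n)\Bigr) \le \beta \cdot \eta. \]

By regularity condition (iii), $d(I'(n))$ may be made as small as desired for
all $n$ by choosing $\delta$ sufficiently small. Thus, $\varphi(A(\epsilon) \cap O) \le \beta \cdot \eta$
so that

\[ \varphi(A(\epsilon)) \le \beta \cdot \eta + \varphi(A(\epsilon) \cap (\Omega - O)) \le (\beta + \epsilon)\eta. \]

Since $A(\epsilon) \subset A(\epsilon')$ if $\epsilon' < \epsilon$, we have:

\[ \varphi(\{p : p \in B \text{ and } D(A, p) > 0\}) = \varphi\Bigl(\lim_{\epsilon \to 0^+} A(\epsilon)\Bigr) = \lim_{\epsilon \to 0^+} \varphi(A(\epsilon)) = 0 \]

which proves the theorem.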
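THEOREM 9. If $\varphi(A)$ is finite, then $D(A, p) = 1$ for almost all $p \in A$.

Proof. In view of Theorem 6 we need only show that the set $A(b) =
\{p : p \in A$ and $D(A, p) > 1 + b\}$ is null for all $b > 0$. Fix $b > 0$, assign
$\epsilon > 0$ and obtain $\delta > 0$ by applying Theorem 5 to the set $A$. As in Theorem 8,
obtain a sequence of disjoint sets $I(n)$, $n = 1, 2, 3, \ldots$, such that

\[ A(b) \subset \bigcup_{n=1}^{\infty} I'(n) \]

and $d(I'(n)) < \delta$, $\tau(I'(n)) \le \beta \cdot \tau(I(n))$ and $\varphi(A \cap I(n)) > (1 + b) \cdot \tau(I(n))$
for all $n$. Then:

\[ \varphi\Bigl(A \cap \bigcup_{n=1}^{\infty} I(n)\Bigr) = \sum_{n=1}^{\infty} \varphi(A \cap I(n)) > (1 + b) \cdot \sum_{n=1}^{\infty} \tau(I(n)) \]

and, by Theorem 5,

\[ \varphi(A(b)) \le \varphi\Bigl(A \cap \bigcup_{n=1}^{\infty} I'(n)\Bigr) \le \sum_{n=1}^{\infty} \tau(I'(n)) + \epsilon \]

and

\[ \varphi\Bigl(A \cap \bigcup_{n=1}^{\infty} I(n)\Bigr) \le \sum_{n=1}^{\infty} \tau(I(n)) + \epsilon. \]

From these relations, $\varphi(A(b)) \le \epsilon \cdot (1 + \beta/b)$ so that $\varphi(A(b)) = 0$.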
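The closing estimate in the proof of Theorem 9 combines three relations. One way to carry out the arithmetic is sketched below; the abbreviation $S$ for the sum of the $\tau(I(n))$ is introduced here for illustration, and non-strict inequalities are used throughout so that the sketch does not depend on which of the original relations are strict.

```latex
% Assumptions: b > 0, \beta finite, S = \sum_n \tau(I(n)) finite, and
%   (1+b) S \le \varphi(A \cap \bigcup I(n)) \le S + \epsilon,
%   \varphi(A(b)) \le \sum_n \tau(I'(n)) + \epsilon,  \tau(I'(n)) \le \beta \tau(I(n)).
\begin{align*}
  (1+b)\,S \le S + \epsilon
    &\;\Longrightarrow\; S \le \frac{\epsilon}{b}, \\
  \varphi\bigl(A(b)\bigr) \le \beta\, S + \epsilon
    &\;\le\; \frac{\beta\,\epsilon}{b} + \epsilon
     \;=\; \epsilon\Bigl(1 + \frac{\beta}{b}\Bigr).
\end{align*}
```

With $b$ and $\beta$ fixed and $\epsilon$ arbitrary, this forces $\varphi(A(b)) = 0$.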
THEOREM 10. If $\varphi(A)$ is finite and $B$ is a measurable cover for $A$, then
$\varphi(A \cap I) = \varphi(B \cap I)$ for every measurable set $I$. Thus, $D(A, p) = D(B, p)$
for every point $p$.

Proof. Since $I$ is measurable,

\[ \varphi(A) = \varphi(A \cap I) + \varphi(A \cap (\Omega - I)) \]

and

\[ \varphi(B) = \varphi(B \cap I) + \varphi(B \cap (\Omega - I)). \]

Thus, $[\varphi(A \cap I) - \varphi(B \cap I)] + [\varphi(A \cap (\Omega - I)) - \varphi(B \cap (\Omega - I))] = 0$. Since $A \subset B$,
both numbers in the square brackets are non-positive, so both are 0, which
proves the theorem.


The following four theorems contain results similar to those obtained by
Jeffery (2) and Randolph (5) for the case of Hausdorff outer measure in
Euclidean space. It is interesting to note that they hold for any function
$D(A, p)$ such that, if $\varphi(A)$ is finite, then:

(i) $D(A, p) > 0$ for almost all $p \in A$.
(ii) $D(A, p) = 0$ for almost all $p \in \Omega - A$ if $A$ is measurable.
(iii) $D(A, p) = D(B, p)$ for every $p$ if $B$ is a measurable cover for $A$.


THEOREM 11. If $\varphi(A)$ is finite and

\[ G = \{p : p \in \Omega - A \text{ and } D(A, p) > 0\} \]

then:
(i) $A \cup G$ is a measurable cover for $A$.
(ii) $A$ is measurable if and only if $G$ is null.

Proof. Let $B$ be a measurable cover for $A$ and let

\[ C = \{p : p \in B \text{ and } D(A, p) = 0\}, \]
\[ D = \{p : p \in \Omega - B \text{ and } D(A, p) > 0\}, \]
\[ E = \{p : p \in A \text{ and } D(A, p) = 0\}. \]

By Theorems 9 and 10, $C$ and $E$ are null. By Theorems 3, 10, and 8, $D$ is null.
Since $A \cup G = (B - C) \cup D \cup E$, $A \cup G$ is a measurable cover for $A$.
Since $A = (A \cup G) - G$, if $G$ is null then $A$ is measurable. If $A$ is measurable
then:

\[ \varphi(A) = \varphi(A \cup G) = \varphi((A \cup G) \cap A) + \varphi((A \cup G) \cap (\Omega - A)) = \varphi(A) + \varphi(G) \]

so that $G$ is null.


THEOREM 12. If $\varphi(A)$ is finite and $A$ is measurable and $A = B \cup C$ where
$B$ is separated from $C$, then $B$ and $C$ are measurable.

Proof. The measurability of $B$ follows from Theorems 8, 10, and the relation:

\[ \{p : p \in \Omega - B \text{ and } D(B, p) > 0\} \subset \{p : p \in \Omega - A \text{ and } D(A, p) > 0\} \cup \{p : p \in C \text{ and } D(B, p) > 0\}. \]

Similarly, $C$ is measurable.

THEOREM 13. If $\varphi(A)$ is finite, $G$ is defined as in Theorem 11, and:

\[ F = \{p : p \in \Omega - G \text{ and } D(G, p) > 0\} \]

then:
(i) $F \subset A$ and $A - F$ is a measurable kernel for $A$.
(ii) $G \cup F$ is measurable and $\varphi(G \cup F) = \varphi(A) - \varphi_*(A)$, where $\varphi_*(A)$ is
the inner measure of $A$.
(iii) If $A$ is not measurable then $A$ is not separated from $G$.
(iv) $(\Omega - A) - G$ is separated from $A \cup G$.
(v) $A$ is measurable if and only if $F$ is empty.

Proof. Since $A \cup G$ is a measurable cover for $A$, $D(A, p) = D(A \cup G, p)
\ge D(G, p)$, so that $F \subset A$. By Theorem 11, $G \cup F$ is a measurable cover for
$(A \cup G) - A$, so a measurable kernel for $A$ is $(A \cup G) - (G \cup F) = A - F$.
$G \cup F$ and $A - F$ are measurable and disjoint, so that:

\[ \varphi(G \cup F) + \varphi_*(A) = \varphi(G \cup F) + \varphi(A - F) = \varphi(A). \]

If $A$ is not measurable then $\varphi(G) > 0$. Thus:

\[ \varphi(A) + \varphi(G) = \varphi(A \cup G) + \varphi(G) > \varphi(A \cup G), \]

so that, by Theorem 2, $A$ is not separated from $G$. Since $A \cup G$ is measurable,
(iv) follows from Theorem 3. If $A$ is measurable then $G$ is null so $F$ is empty.
If $F$ is empty then $A = A - F$ is measurable by (i).


THEOREM 14. If $\varphi(A)$ and $\varphi(B)$ are finite and:

\[ A_B = \{p : p \in A \text{ and } D(B, p) > 0\}, \]
\[ B_A = \{p : p \in B \text{ and } D(A, p) > 0\}, \]

then $\varphi(A_B) = \varphi(B_A)$.

Proof. If $A'$ and $B'$ are measurable covers for $A$ and $B$ respectively, then:

\[ A_B = \{p : p \in A \cap B' \text{ and } D(B', p) > 0\} \cup \{p : p \in A \cap (\Omega - B') \text{ and } D(B', p) > 0\}. \]

The first set on the right has outer measure $\varphi(A \cap B')$ by Theorem 6, and the
second set is null by Theorem 11. By Theorem 10, $\varphi(A \cap B') = \varphi(A' \cap B')$,
so that $\varphi(A_B) = \varphi(A' \cap B')$ which, by symmetry, proves the theorem.


REFERENCES 

1. A. S. Besicovitch, On the fundamental geometrical properties of linearly measurable plane sets
of points, Math. Annalen, 98 (1927), 422-464. 

2. R. L. Jeffery, Sets of k-extent in n-dimensional space, Trans. Amer. Math. Soc., 35 (1933), 
629-647. 

3. M. E. Munroe, Introduction to measure and integration (Cambridge, Mass., 1953). 

4. O. Nikodym, Sur la mesure des ensembles plans dont tous les points sont rectilinéairement 
accessibles, Fund. Math., 10 (1927), 116-168. 

5. J. F. Randolph, Some density properties of point sets, Ann. Math., 37 (1936), 336-344. 


Queen's University 
and 
Sir John Cass College 



















ON A CLASS OF NON-SELF-ADJOINT 
DIFFERENTIAL OPERATORS 


R. R. D. KEMP 


The problem of spectral analysis of non-self-adjoint (and non-normal) 
operators has received considerable attention recently. Livsic (5), and more 
recently Brodskii and Livsic (1) have considered operators on Hilbert space 
with completely continuous imaginary parts. Dunford (3) has generalized 
the notion of spectral measure and defined a class of spectral operators on 
Hilbert and Banach space. Schwartz (8) and Rota (7) have investigated 
conditions under which a differential operator will be spectral. The work of 
Naimark (6) and the author (4) on non-self-adjoint differential operators 
leads to an expansion theorem which implicitly defines a type of spectral 
measure. However the projections involved in this will not in general be 
bounded, much less uniformly bounded. 

The present paper is a generalization of (4) to $n$th order differential operators.
If $p(\mu) = \mu^n + a_1\mu^{n-1} + a_2\mu^{n-2} + \ldots + a_n$ is a polynomial with complex
coefficients, the differential expression $p(-iD)$, where $D = d/dx$, can be
used to define a closed operator on $L^p(-\infty, \infty)$ for $1 < p < \infty$. On
$L^2(-\infty, \infty)$ it is a normal operator, and its spectrum $\{\lambda \mid \lambda = p(t)$ for real $t\}$
is the same in all these spaces. We shall consider operators $L$ arising from this
operator $L_0$ by the addition of a linear differential operator of order $n - 2$
with coefficients which are suitably small at $\pm\infty$. The previous paper (4)
dealt with the simplest case $n = 2$, $a_1 = a_2 = 0$.

We shall analyse the spectrum of $L$ and show that it is determined by $p(\mu)$
except for a bounded set of characteristic values which are the zeros of certain
analytic functions. We shall also obtain an expansion of the Green's function,
and from this an expansion in characteristic functions for a suitably restricted
class of functions.


1. Solutions of $Ly = \lambda y$. Now $L = L_0 + L_1$ where $L_1 = \sum_{j=2}^{n} b_j(x)
(-iD)^{n-j}$. We assume that $(x^2 + 1)^r b_j(x) \in L^1(-\infty, \infty)$ for a suitable $r$.
This $r$ is the multiplicity of the root of $p'(\mu) = 0$ which has highest
multiplicity.

The equation $Ly - \lambda y = f$ is equivalent to a system of $n$ first order
equations

\[ Y' = [A + B]Y + F, \tag{1.1} \]


Received June 22, 1959. 











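where $Y$ and $F$ are $n \times 1$ matrices with entries $y_1, y_2, \ldots, y_n$ and $0, 0, \ldots,
0, i^n f$ respectively, and $A$ and $B$ are $n \times n$ matrices with entries

\[ a_{jk} = \delta_{j+1,k} - \delta_{jn}\, i^{n-k+1} a_{n-k+1} + \delta_{jn}\delta_{k1}\, i^n \lambda \quad\text{and}\quad b_{jk} = -\delta_{jn}\, i^{n-k+1} b_{n-k+1}(x) \]

respectively, with the convention that $b_1(x) = 0$.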

In order to obtain the Green's function for $Ly - \lambda y = f$ we first construct
certain solutions of (1.1) for $F = 0$, and use them to construct the Green's
matrix for (1.1). We note that if $\lambda$ is such that $p(\mu) = \lambda$ has $n$ distinct solutions
$\mu_1, \mu_2, \ldots, \mu_n$, then $Z' = AZ$ has a fundamental matrix $M \exp[i\theta x]$ where
$\theta = [\mu_j \delta_{jk}]$ and $M = [(i\mu_k)^{j-1}]$. Using this we obtain an integral equation
equivalent to (1.1) with $F = 0$ in the form

\[ Y(x) = M \exp[i\theta x]\, c_0 + \int^{x} M \exp[i\theta(x - \xi)]\, M^{-1} B(\xi)\, Y(\xi)\, d\xi, \tag{1.2} \]

where $c_0$ is a constant $n \times 1$ matrix and the lower limits on the integrals
(each element in the column matrix) are arbitrary.
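As an illustration of the reduction to a first order system, the case $n = 2$ can be written out by hand. The block below is only a sketch under the assumption that the components are $y_1 = y$, $y_2 = y'$ (consistent with the form $M = [(i\mu_k)^{j-1}]$ of the fundamental matrix); it treats the unperturbed equation $p(-iD)y - \lambda y = f$ with $p(\mu) = \mu^2 + a_1\mu + a_2$.

```latex
% Assumption: y_1 = y, y_2 = y'.  Here p(-iD)y = -y'' - i a_1 y' + a_2 y.
\[
  -y'' - i a_1 y' + a_2 y - \lambda y = f
  \quad\Longleftrightarrow\quad
  \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}'
  =
  \begin{pmatrix} 0 & 1 \\ a_2 - \lambda & -i a_1 \end{pmatrix}
  \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}
  +
  \begin{pmatrix} 0 \\ i^2 f \end{pmatrix}.
\]
% Check of the fundamental matrix: Z = (1, i\mu)^T e^{i\mu x} solves Z' = AZ iff
\[
  (i\mu)^2 = (a_2 - \lambda) - i a_1 (i\mu)
  \quad\Longleftrightarrow\quad
  \mu^2 + a_1 \mu + a_2 = \lambda ,
\]
% so for distinct roots \mu_1, \mu_2 of p(\mu) = \lambda,
\[
  M \exp[i\theta x]
  =
  \begin{pmatrix} 1 & 1 \\ i\mu_1 & i\mu_2 \end{pmatrix}
  \begin{pmatrix} e^{i\mu_1 x} & 0 \\ 0 & e^{i\mu_2 x} \end{pmatrix}.
\]
```

The determinant of $M$ is $i(\mu_2 - \mu_1)$, which is non-zero exactly when the two roots are distinct.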

We shall obtain solutions of (1.2) which are asymptotic to solutions of
$Z' = AZ$, but before stating this result we must make some additional remarks.
If $\lambda_1$ is such that $p(\mu) = \lambda_1$ has $n$ distinct solutions then the same is true for
$\lambda$ sufficiently close to $\lambda_1$, and the solutions $\mu_1(\lambda), \mu_2(\lambda), \ldots, \mu_n(\lambda)$ of $p(\mu) = \lambda$
are analytic functions of $\lambda$ in this neighbourhood of $\lambda_1$. These functions have
branch points at $\lambda_j^0 = p(\mu_j^0)$, $j = 1, 2, \ldots, n - 1$, where $\mu_j^0$, $j = 1, 2, \ldots,
n - 1$, are the $n - 1$ solutions of $p'(\mu) = 0$. In any simply connected region
containing no $\lambda_j^0$ the functions $\mu_k(\lambda)$ are analytic. In solving (1.2) the curves
$\gamma_{jk}$ defined by the equations $\operatorname{Im} \mu_j = \operatorname{Im} \mu_k$ will be important. We shall first
solve (1.2) in regions $D$ which are simply connected, bounded away from
branch points, and contain none of the curves $\gamma_{jk}$ in their interiors.
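For orientation, the branch points and the curves on which two imaginary parts coincide can be computed explicitly in the simplest case $n = 2$, $a_1 = a_2 = 0$. The following is a sketch for that special case only; the labelling of the two roots is a choice made here.

```latex
% Special case p(\mu) = \mu^2.
\[
  p(\mu) = \lambda \;\Longleftrightarrow\; \mu_{1,2}(\lambda) = \pm \lambda^{1/2},
  \qquad
  p'(\mu) = 2\mu = 0 \;\Longrightarrow\; \mu^0 = 0,\quad \lambda^0 = p(\mu^0) = 0 .
\]
% The curve \gamma_{12}: Im \mu_1 = Im \mu_2.
\[
  \operatorname{Im} \lambda^{1/2} = \operatorname{Im}\bigl(-\lambda^{1/2}\bigr)
  \;\Longleftrightarrow\;
  \operatorname{Im} \lambda^{1/2} = 0
  \;\Longleftrightarrow\;
  \lambda \in [0, \infty).
\]
```

In this case the single curve coincides with the set $\{\lambda = p(t),\ t \text{ real}\}$, and an admissible region $D$ is any simply connected region in the plane cut along $[0, \infty)$ that stays away from $\lambda = 0$.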


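THEOREM 1.1. There are solutions $\phi_1, \phi_2, \ldots, \phi_n$ and $\hat\phi_1, \hat\phi_2, \ldots, \hat\phi_n$ of (1.2)
which exist for all bounded $\lambda$ in $D$. The matrices $\Phi = [\phi_1\ \phi_2 \ldots \phi_n]$ and $\hat\Phi =
[\hat\phi_1\ \hat\phi_2 \ldots \hat\phi_n]$ are analytic in $\lambda$ ($\lambda \in D$) for fixed $x$, and have the following
asymptotic behaviour:

\[ \Phi(x, \lambda) = M \exp[i\theta x]\,(I + o(1)), \qquad x \to \infty, \tag{1.3} \]
\[ \hat\Phi(x, \lambda) = M \exp[i\theta x]\,(I + o(1)), \qquad x \to -\infty, \tag{1.4} \]

where $I = [\delta_{jk}]$ is the identity matrix.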


This theorem is a modification of Theorem 8.1 in Coddington and Levinson
(2; p. 92), and its proof will be omitted. We shall need to know the relation
that $\Phi_1(x, \lambda)$ and $\hat\Phi_1(x, \lambda)$ in $D_1$ bear to the corresponding matrices in $D_2$
where $D_1 \cap D_2$ is a portion of one (or more) of the curves $\gamma_{jk}$. By an examination
of the particular cases of (1.2) used in proving Theorem 1.1 one can
see that $\phi_{p1}(x, \lambda) = \phi_{p2}(x, \lambda)$ on $\gamma_{jk}$ unless $p = j$ or $p = k$, and that $\phi_{j1}(x, \lambda) =
\phi_{j2}(x, \lambda) + c_k \phi_{k2}(x, \lambda)$ + a linear combination of terms $\phi_{p2}(x, \lambda)$ which are of
lower order of growth at $\infty$ than $\phi_j$ or $\phi_k$. A similar relation holds for $\phi_k$ and
completely analogous results are true for the $\hat\phi_p$'s. The generalization to the
case where several of the curves $\gamma$ coincide is immediate.

We shall also be interested in the asymptotic behaviour of $\Phi$ and $\hat\Phi$ as
$|\lambda| \to \infty$. We see very easily that for large $|\lambda|$ the $\mu_k$'s can be renumbered so
that $\mu_j = \alpha_j \lambda^{1/n}(1 + O(|\lambda|^{-1/n}))$ where $0 \le \arg \lambda^{1/n} < (2\pi)/n$ and
$\alpha_j = \exp(2\pi j i/n)$. Since we are bounded away from the branch points it is
easy to show by direct calculation that each entry in $M^{-1} B(x) M$ is less than
or equal to $K|\lambda|^{-1/n} \sum_{j=2}^{n} |b_j(x)|$. From this we obtain

\[ \Phi(x, \lambda) = M \exp[i\theta x]\,(I + O(|\lambda|^{-1/n})) \quad\text{as } |\lambda| \to \infty, \tag{1.5} \]

and

\[ \hat\Phi(x, \lambda) = M \exp[i\theta x]\,(I + O(|\lambda|^{-1/n})) \quad\text{as } |\lambda| \to \infty, \tag{1.6} \]

provided $|\lambda| \to \infty$ while $\lambda \in D$. It is not hard to see from the asymptotic
behaviour of the $\mu_k$'s that such regions $D$ do exist.
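The asymptotic form of the roots for large $|\lambda|$ can be checked directly when $n = 2$. The computation below is a sketch for the quadratic case only, with the branch of $\lambda^{1/2}$ fixed by $0 \le \arg \lambda^{1/2} < \pi$; the constant $c$ is an abbreviation introduced here.

```latex
% Quadratic case p(\mu) = \mu^2 + a_1 \mu + a_2, with c = a_1^2/4 - a_2.
\[
  \mu^2 + a_1\mu + a_2 = \lambda
  \;\Longleftrightarrow\;
  \mu = -\tfrac{1}{2}a_1 \pm (\lambda + c)^{1/2}.
\]
% Expand for large |\lambda|:
\[
  (\lambda + c)^{1/2} = \lambda^{1/2}\bigl(1 + c/\lambda\bigr)^{1/2}
                      = \lambda^{1/2}\bigl(1 + O(|\lambda|^{-1})\bigr),
  \qquad
  -\tfrac{1}{2}a_1 = \lambda^{1/2}\, O(|\lambda|^{-1/2}),
\]
\[
  \mu_{\pm} = \pm \lambda^{1/2}\bigl(1 + O(|\lambda|^{-1/2})\bigr),
  \qquad
  \alpha_1 = e^{\pi i} = -1,\quad \alpha_2 = e^{2\pi i} = 1 .
\]
```

So the two roots behave like $\alpha_j \lambda^{1/2}$ with a relative error of order $|\lambda|^{-1/2}$, in agreement with the general statement for $n = 2$.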

If $D$ is included in another such region $D_1$, which extends closer to the
branch points, the solutions making up $\Phi$ and $\hat\Phi$ are changed, but $\Phi_1 = \Phi C$
and $\hat\Phi_1 = \hat\Phi C'$ where $C$ and $C'$ are constant matrices. It is easy to use (1.3)
and (1.4) to conclude that $C$ has units along the main diagonal and zeros
below and that $C'$ is the transpose of such a matrix, provided that the $\mu_k$'s
are numbered so that $\operatorname{Im} \mu_1 > \operatorname{Im} \mu_2 > \ldots > \operatorname{Im} \mu_n$.

Finally we shall also need solutions of (1.1) when $\lambda$ is a branch point of
the functions $\mu_k(\lambda)$. At such a point the $\mu_k$'s coincide in groups and the solutions
of $p(\mu) = \lambda$ will be denoted by $\mu_1, \mu_2, \ldots, \mu_r$ with multiplicities $m_1, m_2, \ldots,
m_r$ ($\sum_{j=1}^{r} m_j = n$). A fundamental matrix of $Z' = AZ$ will then have the
more complicated form $M_1 D \exp[i\theta_1 x]$ where $\theta_1$ is a diagonal matrix with
$m_1$ $\mu_1$'s, then $m_2$ $\mu_2$'s, etc. down the diagonal. The matrix $D$ can be partitioned
so that it has zero matrices off the diagonal and blocks $D_1, D_2, \ldots, D_r$ down
the diagonal. $D_k$ is $m_k \times m_k$, has zeros below the main diagonal, and $(D_k)_{pq} =
x^{q-p}/(q - p)!$ for $q \ge p$. Thus $D^{-1}(x) = D(-x)$ and $D$ commutes with
$\exp[i\theta_1 x]$. The columns of $M_1$ are determined in $r$ groups of $m_1, \ldots, m_r$
columns. The first column in the $k$th group is an $n \times 1$ matrix satisfying the
equation $(A - i\mu_k) z_1^{(k)} = 0$, and the $j$th column in this group is a solution
of the equation $(A - i\mu_k) z_j^{(k)} = z_{j-1}^{(k)}$ $(j = 2, \ldots, m_k)$. In particular, a
suitable choice of constants leads to the following formula for the $l$th entry
in $z_j^{(k)}$:

\[ \bigl(z_j^{(k)}\bigr)_l = \begin{cases} \dbinom{l-1}{j-1} (i\mu_k)^{l-j}, & l \ge j, \\ 0, & l < j. \end{cases} \]

Thus the equation corresponding to (1.2) is

\[ Y(x) = M_1 D(x) \exp[i\theta_1 x]\, c_0 + \int^{x} M_1 D(x - \xi) \exp[i\theta_1(x - \xi)]\, M_1^{-1} B(\xi)\, Y(\xi)\, d\xi. \tag{1.7} \]
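The structure of the fundamental matrix at a branch point is easiest to see in the smallest example. The block below is a sketch for $n = 2$, $p(\mu) = \mu^2$, at the branch point $\lambda = 0$, assuming the first order system is formed with components $y_1 = y$, $y_2 = y'$; the polynomial factor $D(x)$ is written out explicitly.

```latex
% Special case p(\mu) = \mu^2, \lambda = 0: one root \mu_1 = 0 of multiplicity m_1 = 2.
\[
  A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix},
  \qquad
  \theta_1 = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix},
  \qquad
  D(x) = \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix},
  \qquad
  M_1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.
\]
% Verification that Z = M_1 D(x) exp[i\theta_1 x] = D(x) is a fundamental matrix:
\[
  \frac{d}{dx} D(x) = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}
  = A\, D(x),
  \qquad
  D(x)\, D(-x) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.
\]
% Columns of M_1: (A - i\mu_1) z_1 = 0 and (A - i\mu_1) z_2 = z_1 with \mu_1 = 0:
\[
  z_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix},
  \qquad
  z_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix},
  \qquad
  A z_2 = z_1 .
\]
```

The solutions are $1$ and $x$, that is, polynomial growth replaces the second exponential, which is why stronger decay of the coefficients is needed at branch points.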








THEOREM 1.2. There exist solutions $\phi_j^{(k)}(x)$ and $\hat\phi_j^{(k)}(x)$ of (1.7) for
$j = 1, \ldots, r$; $k = 1, \ldots, m_j$ such that

\[ \phi_j^{(k)}(x) = \frac{x^{k-1}}{(k-1)!}\, e^{i\mu_j x} \bigl(z_1^{(j)} + o(1)\bigr) \quad\text{as } x \to \infty. \]

The solution $\hat\phi_j^{(k)}(x)$ has the same asymptotic behaviour as $x \to -\infty$.

The proof of this theorem is a modification of that of Theorem 1.1, and will
be omitted. The appropriate modification is described in Problem 35 of
Coddington and Levinson (2, p. 106). We might note that it is at this point
that the full strength of the assumptions on $b_j(x)$ is used (in Theorem 1.1
it is only necessary to assume $b_j(x) \in L^1$).


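2. Construction of the Green's function. We shall now discuss the
solution of (1.1) for $F \ne 0$ when the equation $p(\mu) = \lambda$ has no real solutions.
For any matrix or vector function we shall use the following notations:

\[ |A(x)|^2 = \sum_{i,j} |A_{ij}(x)|^2, \quad\text{with } |A(x)| = \bigl[|A(x)|^2\bigr]^{1/2} \]

and

\[ \|A\|_p = \Bigl\{ \int_{-\infty}^{\infty} |A(x)|^p\, dx \Bigr\}^{1/p}. \]

As $p(\mu) = \lambda$ has no real solutions we can number the solutions $\mu_1, \mu_2, \ldots, \mu_n$
so that $\operatorname{Im} \mu_1 \ge \operatorname{Im} \mu_2 \ge \ldots \ge \operatorname{Im} \mu_m > 0 > \operatorname{Im} \mu_{m+1} \ge \operatorname{Im} \mu_{m+2} \ge \ldots \ge
\operatorname{Im} \mu_n$. Thus $\phi_1, \phi_2, \ldots, \phi_m$ are exponentially small at $\infty$ and $\hat\phi_{m+1}, \hat\phi_{m+2}, \ldots,
\hat\phi_n$ are exponentially small at $-\infty$. We shall partition our matrices as follows:

\[ \Phi = \begin{bmatrix} \Phi_{11} & \Phi_{12} \\ \Phi_{21} & \Phi_{22} \end{bmatrix}, \qquad \hat\Phi = \begin{bmatrix} \hat\Phi_{11} & \hat\Phi_{12} \\ \hat\Phi_{21} & \hat\Phi_{22} \end{bmatrix}, \]

where $\Phi_{11}$ and $\hat\Phi_{11}$ are $m \times m$ and the rest coherent with this. We now define

\[ \Psi = \begin{bmatrix} \Phi_{11} & \hat\Phi_{12} \\ \Phi_{21} & \hat\Phi_{22} \end{bmatrix} \]

and provided that $\Psi^{-1}$ exists

\[ K(x, \xi, \lambda) = \begin{cases} \begin{bmatrix} \Phi_{11}(x) & 0 \\ \Phi_{21}(x) & 0 \end{bmatrix} \Psi^{-1}(\xi), & \xi < x, \\[2ex] -\begin{bmatrix} 0 & \hat\Phi_{12}(x) \\ 0 & \hat\Phi_{22}(x) \end{bmatrix} \Psi^{-1}(\xi), & x < \xi. \end{cases} \]

THEOREM 2.1. If $\Psi$ is non-singular for a particular value of $\lambda$ then $K(x, \xi, \lambda)$
is the Green's matrix for the solution of (1.1).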


Proof. By this we mean that if $F$ is a vector function with $\|F\|_p < \infty$
(that is, $F \in L^p$) then the vector function

\[ y(x) = \int_{-\infty}^{\infty} K(x, \xi, \lambda)\, F(\xi)\, d\xi \tag{2.1} \]

is the unique solution of (1.1) which belongs to $L^p$.

If two such solutions exist then there is a solution of (1.1) with $F = 0$
which belongs to $L^p$. As this cannot be exponentially large at either $+\infty$ or
$-\infty$ it is a linear combination of $\phi_1, \phi_2, \ldots, \phi_m$; and also a linear combination
of $\hat\phi_{m+1}, \hat\phi_{m+2}, \ldots, \hat\phi_n$. A non-zero solution cannot have this property if $\Psi$ is
non-singular, so an $L^p$ solution of (1.2) must be unique if it exists.

An examination of the definition of $K(x, \xi, \lambda)$ yields the fact that if the
integral in (2.1) exists it must be differentiable and satisfy (1.1). Thus the
burden of proof is in showing that $y$ defined by (2.1) exists and belongs to $L^p$.

Since $\Phi$ and $\hat\Phi$ are both fundamental matrices $\hat\Phi = \Phi C$ where $C$ is a constant
matrix. We partition $C$, $C^{-1}$, $\Phi^{-1}$, $\hat\Phi^{-1}$ in the same way as $\Phi$ and $\hat\Phi$, and use
the notations $\Phi^{-1} = [\Phi^{ij}]$, $\hat\Phi^{-1} = [\hat\Phi^{ij}]$, $C = [C_{ij}]$, and $C^{-1} = [C^{ij}]$. We also
note that if $\lambda$ is not a branch point $\Phi$ and $\hat\Phi$ behave asymptotically like
$M \exp[i\theta x]$ so their inverses behave asymptotically like $\exp[-i\theta x] M^{-1}$.
Although such a precise statement cannot be made if $\lambda$ is a branch point,
bounds on the elements in $\Phi^{-1}$ and $\hat\Phi^{-1}$ can be obtained.

For $x$ and $\xi$ non-negative we write $K(x, \xi, \lambda)$ in the form


Pee ae 
oer J ‘le (), &<x, 


K(x, t,) = § 0 A 
— (x) \. 4] #”*(é), x < &, 


where $A = C^{12}(C^{22})^{-1}$. A careful examination of this yields the fact that,
whether $\lambda$ is a branch point or not, each element is bounded by $K \exp[-\delta|x - \xi|]$
where $\delta$ and $K$ are positive constants and $\delta < \min[|\operatorname{Im} \mu_m|, |\operatorname{Im} \mu_{m+1}|]$. This
latter condition allows for bounds on the terms which are of order $x^k e^{i\mu_j x}$
for some $k$.

If $x < 0 < \xi$ we rewrite $K(x, \xi, \lambda)$ in the form


a, 10 @ 
K(x,8 A) = — &(x) |) | #70), 


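and again find that each element is bounded by $K \exp[-\delta|x - \xi|]$.
On performing similar analyses for $x$, $\xi$ non-positive and for $x > 0 > \xi$ we
obtain

\[ |K(x, \xi, \lambda)| \le K \exp[-\delta|x - \xi|], \tag{2.2} \]

where $\delta$ and $K$ are as before although $K$ may have been increased.
Thus, from (2.1)

\[ |y(x)| \le K \int_{-\infty}^{\infty} e^{-\delta|x - \xi|}\, |F(\xi)|\, d\xi \le K \Bigl\{ \int_{-\infty}^{\infty} e^{-\delta p|x - \xi|/2}\, |F(\xi)|^p\, d\xi \Bigr\}^{1/p} \Bigl\{ \int_{-\infty}^{\infty} e^{-\delta q|x - \xi|/2}\, d\xi \Bigr\}^{1/q} \]

by Hölder's inequality. Thus, using a change of variables and the Fubini
theorem we obtain

\[ \int_{-\infty}^{\infty} |y(x)|^p\, dx \le K^p (4/q\delta)^{p/q} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{-\delta p|\xi|/2}\, |F(x - \xi)|^p\, d\xi\, dx \le K^p (4/q\delta)^{p/q} \int_{-\infty}^{\infty} e^{-\delta p|\xi|/2} \Bigl\{ \int_{-\infty}^{\infty} |F(x)|^p\, dx \Bigr\} d\xi \le K_1 \|F\|_p^p. \]

Thus $y(x)$ exists and belongs to $L^p$ for all $p > 1$.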
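The same conclusion can be reached by a slightly different route, which also makes the constant explicit. The block below is a sketch that uses Young's convolution inequality in place of the Hölder and Fubini steps; it assumes only the pointwise bound $|K(x,\xi,\lambda)| \le K e^{-\delta|x-\xi|}$, and the name $g$ for the exponential kernel is introduced here for illustration. It is not the argument of the text.

```latex
% Assumption: |y(x)| \le (g * |F|)(x) with g(t) = K e^{-\delta |t|}, \delta > 0.
\[
  \|g\|_1 = K \int_{-\infty}^{\infty} e^{-\delta|t|}\, dt = \frac{2K}{\delta}.
\]
% Young's inequality  \| g * h \|_p \le \|g\|_1 \|h\|_p  (1 \le p \le \infty), with h = |F|:
\[
  \|y\|_p \;\le\; \bigl\| g * |F| \bigr\|_p \;\le\; \|g\|_1\, \|F\|_p
          \;=\; \frac{2K}{\delta}\, \|F\|_p .
\]
```

For comparison, the integral that produces the factor $4/(q\delta)$ in the Hölder argument is $\int_{-\infty}^{\infty} e^{-\delta q|t|/2}\, dt = 4/(q\delta)$.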


COROLLARY 2.1. The Green's function for the differential operator $L$ is
$G(x, \xi, \lambda) = K_{1n}(x, \xi, \lambda)$ where $K_{1n}(x, \xi, \lambda)$ is the element in the first row and
$n$th column of $K(x, \xi, \lambda)$, provided that $K(x, \xi, \lambda)$ exists.

We might note that $K(x, \xi, \lambda)$, and thus $G(x, \xi, \lambda)$ may very well exist
even if $p(\mu) = \lambda$ has real solutions, although they have not been proved to
be Green's functions in this case.

Since the theorem depends upon $\Psi$ being non-singular we must examine this
question in detail. Note that for real $t$ the equation $\lambda = p(t)$ defines a curve
in the complex plane, which will in general split the complex plane up into
several regions $D_j$ $(j = 1, 2, \ldots, p)$. Suppose $\lambda_0$ is in one of these regions and
does not lie on any $\gamma_{jk}$. Then in a neighbourhood of $\lambda_0$ no $\operatorname{Im} \mu_j$ changes sign.
Thus in this neighbourhood $m$ is fixed and $K(x, \xi, \lambda)$ is analytic in $\lambda$. In crossing
a $\gamma_{jk}$ while remaining within $D_j$ the $\phi_p$'s and $\hat\phi_p$'s may change, but from the
considerations of §1 we see that if on one side $\operatorname{Im} \mu_1 > \operatorname{Im} \mu_2 > \ldots > \operatorname{Im} \mu_n$
then on $\gamma_{jk}$ the $\Phi$ matrix from that side, $\Phi_1$, will have the asymptotic form

\[ \Phi_1 = M \exp[i\theta x]\, A_1 [I + o(1)] \quad\text{as } x \to \infty \]

where $A_1$ has units along the main diagonal and zeros below it. From the other
side the only difference will be that the order of the $\mu_k$'s has been altered. Thus

\[ \Phi_2 = M \exp[i\theta x]\, Q A_2 [I + o(1)] \quad\text{as } x \to \infty \]

where $A_2$ is of the same form as $A_1$ and $Q$ rearranges the columns of $M \exp[i\theta x]$
appropriately. Hence $\Phi_1 = \Phi_2 A_2^{-1} Q^{-1} A_1$. Similarly $\hat\Phi_1 = \hat\Phi_2 B_2^{-1} Q^{-1} B_1$
where $B_1$ and $B_2$ have units along the main diagonal and zeros above. Note
that $A_2^{-1}$ and $B_2^{-1}$ have the same form as $A_2$ and $B_2$ respectively and that it
is impossible to have a $\gamma_{jk}$ here where $j \le m < k$ or $k \le m < j$. This implies
that in $Q = [Q_{jk}]$, partitioned as before, we have $Q_{12} = Q_{21} = 0$. Using this
and the definition of $\Psi$ we see that

\[ \Psi_1 = \Psi_2 \begin{bmatrix} P_{11} & 0 \\ 0 & P_{22} \end{bmatrix} \tag{2.3} \]

where $P_{11}$ and $P_{22}$ have determinant $\pm 1$ (as $A_j$, $B_j$, and $Q$ have this property).
It should be noted that a particular set of solutions $\phi_p$ or $\hat\phi_p$ exist in a set
which is bounded away from the branch points. Thus as well as noting the
behaviour of $\Psi$ and finally of $K(x, \xi, \lambda)$ across the curves $\gamma_{jk}$, we must consider
how they are affected by a replacement of the $\phi_p$'s and $\hat\phi_p$'s by a different set
which exist in a larger region extending closer to the branch points. An
examination of the changes required in the particular cases of (1.2) to obtain
such a new set of solutions shows that in their common domain the two $\Psi$'s
will again be related by an equation of the form of (2.3) with $P_{11}$ having units
on the main diagonal and zeros below, and $P_{22}$ being the transpose of such a
matrix.
These remarks may be summarized in the following theorem.


THEOREM 2.2. If the branch points are removed from the regions $D_m$, the
matrix $K(x, \xi, \lambda)$ is analytic in the remaining portion except at points where $\Psi$
is singular. If we define a function $W_m(\lambda) = \pm(\det \Psi)\exp[i a_1 x]$ by fixing
the sign at some point and then choosing coherent signs on the two sides of each
$\gamma_{jk}$, we obtain a function locally analytic in $D_m$ except at branch points, which may
be double-valued. (Note that $W_m(\lambda)$ is independent of $x$.)


Proof. The first statement is obvious from (2.3). If $D_m$ contains no branch
points the functions $\mu_1(\lambda), \mu_2(\lambda), \dots, \mu_n(\lambda)$ are everywhere distinct and
analytic throughout. Thus $W_m(\lambda)$ is also analytic throughout, but if $D_m$ contains
a branch point, continuation of $\mu_1(\lambda), \dots, \mu_n(\lambda)$ around a curve surrounding it
will result in returning to $\mu_{\pi(1)}(\lambda), \dots, \mu_{\pi(n)}(\lambda)$ where $\pi$ is a permutation of
the integers $1, 2, \dots, n$ which is not the identity. This may result in a change
of the sign of $W_m(\lambda)$, in which case $W_m(\lambda)$ is double-valued in $D_m$.

Thus the points where $\Psi$ is singular in any such $D_m$ are determined by the
zeros of a (perhaps double-valued) function locally analytic except at branch
points. Thus unless this function is identically zero the points where $\Psi$ is
singular form a countable set in $D_m$ with limit points (if any) at the branch
points and on the boundary. We will attempt to characterize this set somewhat
more completely, and to characterize the regions $D_m$ where $W_m(\lambda)$ can be
identically zero. Such $D_m$'s will be called exceptional.

On investigating the region of validity of the original definition of $W_m(\lambda)$
by a determinant we see that it is valid across an arc of $\lambda = p(t)$ which does
not coincide with a $\gamma_{kl}$ $(= \gamma_{lk})$ with $k \le m < l$. It is easily seen that such a
$\gamma_{kl}$ would have to be $\gamma_{m,m+1}$, so that this portion of $\lambda = p(t)$ is traversed twice.
Thus the original definition of $W_m(\lambda)$ is valid across an arc of $\lambda = p(t)$ which
bounds $D_m$ and contains no double points. Hence if $W_m(\lambda) \not\equiv 0$ the limit
points of its zeros can only lie at branch points in $D_m$, boundary points of $D_m$
which are multiple points of the curve $\lambda = p(t)$, or outside of $D_m$. One notes
that the original definition of $W_m(\lambda)$ is valid outside of $D_m$ up to $\gamma_{kl}$ with
$k \le m < l$ (using (2.3) to cross other $\gamma_{jk}$) and if this results in it being valid
in a neighbourhood of some arc extending to $\infty$ we may use (1.5) and (1.6)
to conclude that in this neighbourhood

$$W_m(\lambda) = \pm e^{i a_1 x} \det\{M \exp[i\theta x]\,(I + O(|\lambda|^{-1/n}))\} = \pm(\det M)\,(1 + O(|\lambda|^{-1/n})) \quad \text{as } |\lambda| \to \infty.$$




For large $|\lambda|$ we again use the asymptotic behaviour of the appropriately
numbered $\mu_j$'s to find that


det (M) = +n (—a)-t Qhin—1)(n--2) [1 + O(JA|-*)] as [A] > @. 


Thus $W_m(\lambda)$, which can be so continued, cannot be identically zero. This
proves the following theorem.


THEOREM 2.3. In order to be exceptional, $D_m$ must be bounded from $\infty$ by
curves $\gamma_{kl}$ with $k \le m < l$. In any non-exceptional $D_m$ (or an exceptional $D_m$
where $W_m(\lambda)$ is not identically zero), the points where $\Psi$ is singular make up a
discrete bounded set with limit points at the branch points, or at points $\lambda_0$ on the
boundary where $p(\mu) = \lambda_0$ has more than one real root.


We might note that there is always at least one unbounded $D_m$, which must
therefore be non-exceptional, and that exceptional $D_m$'s can exist, for example,


if 
a ee ee ee 
p(u) -|2. (wu —1)° - ES (wu — 1) +33 -1)- | 
it is found that A = u + w = p(t) is the curve u = (r? — 1)?, v = r(r? — 1), 
and that the loop in it between r = — 1 and r = + 1 is traced three times 


(so encloses an exceptional region). 

It is perhaps also worthwhile to remark that this classification of the regions
$D_m$, and the characterization of the possible limit points of the zeros of $W_m(\lambda)$,
depends only upon $L_0 = p(-iD)$, and not on the perturbing operator $L_1$.


3. The spectrum of the operators L and L*. We have seen above that
there are values of $\lambda$ for which the differential equation $Ly - \lambda y = f$ possesses
a Green's function $G(x, \xi, \lambda)$ which generates a bounded operator on
$L^p(-\infty, \infty)$ for $1 \le p \le \infty$. Fix such a value $\lambda_0$.


THEOREM 3.1. The operator $L = L_0 + L_1$ with domain $D_{L,p} = \{y \in L^p \cap C^{n-1} \mid y^{(n-1)}$
is absolutely continuous and $Ly \in L^p\}$ is a closed operator on
$L^p(-\infty, \infty)$.


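Proof. Let $y_n \in D_{L,p}$, $y_n \to y_0$ and $Ly_n \to f$. Now

$$y_n(x) = \int_{-\infty}^{\infty} G(x, \xi, \lambda_0)\,(L - \lambda_0)y_n(\xi)\,d\xi,$$

and the boundedness of the operator generated by $G$ implies that

$$y_0(x) = \int_{-\infty}^{\infty} G(x, \xi, \lambda_0)\,[f(\xi) - \lambda_0 y_0(\xi)]\,d\xi.$$

From this equation it follows immediately that $y_0 \in D_{L,p}$ and $Ly_0 = f$.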

We shall now introduce the adjoint operator L on L*(— », ). Here 
(1/p + 1/q = 1). If p= 1, g = © and if » = © we shall introduce an 
adjoint on L'(— ©, @) although this is not the dual space. Let L, = i DI,-1+ 
a, + d(x) where Lo = landk = 0,1,..., n, and define the operator L with 










domain Dz, = {y € L*|L, y is absolutely continuous k = 0,1,...,2—1 
and L,y € L?} by Ly = Lyy. 


THEOREM 3.2. For 1<p< @L om L* is the Banach space adjoint of 
L on L?; while for p = @ we define L’ as an adjoint to L on L' by D(L’) = 
(2 € L'| there is 2 € L' with [- 2Lydx =f". 2'ydx for all y € Dz} and 
L's = 2' forz € D(L’), and again find L = L’. 


Proof. in either case we must show that if z © L* such that there is 2’ € L* 
with fi sLydx = f__2'ydx for all y€ Dy, then 2€ Dy, and Lz =~2’. 
If f = (L — Xo)y then we may rewrite the above equation in the form 


fs] a f G(E, x, Ao) {2’(E) — nus(e)}at [ae = 0. 


This holds for all f in the range of L — Xo, which is all of L? so 
2(x) = fee. x, do) {2’(E) — doz(E) }dé. 


If F is the column vector with 2’ — \gz in its first position and zeros elsewhere 
it is easy to see that z is the mth entry in the column vector Z(x) = ef 
K(&, x, Ac)” F(é)dé. Using the definition of K(x, —, 0) we see that Z’ = — 
(A? + B*)Z + i"F, and an examination of this set of equations yields the 
fact that z € Dz, and Lz = 2’. 

Conversely, ify © D,,,and z € D;,, it follows easily that faLydx = SyLedx. 


COROLLARY 3.1. For 1<p< @ L on L? is the Banach space adjoint of 
Lon L‘. For p = 1 L’ = Lwhere L’ is defined as in the statement of the Theorem. 


Proof. Except for the case p = © this follows from Lemma 1.4 of Rota (7). 
For p = © the graph of the Banach space adjoint of L’ is the closure of the 
graph of L in the L' @ L' topology on L® @ L”. It is easy to see from the 
existence of a Green’s function that the graph of L is closed in this topology, 
so the result follows. 

Thus the Banach space adjoint operator is closely related to the Lagrange 
adjoint differential operator (which does not exist in the normal sense unless 
b,(x) € C**). For p = 2 the usual Hilbert space adjoint of L is given by 





L*y = Lg for y € Dz». In order to avoid making separate statements we 
shall define L* on L” by this equation for y € Dz, = Dy» ». 

Since the solution of (L* — X) y =f will be given by the mth component 
of a solution to Y’ = — (A* + Bt) Y + (—1)" F where A* and B* are the 
conjugate transposes of A and B respectively and F is a column vector with 
f in the first position and zero elsewhere, the work of §§1 and 2 carries over 
almost completely. The polynomial p(x) is replaced by p* (u) = p(@). The 
quantities associated with the adjoint by this and subsequent work will be 
denoted by an asterisk superscript and to avoid confusion the conjugate 
transpose of a matrix C will be denoted by C™. 













Many results about the spectra of $L$ and $L^*$ are implicit above, but we shall
gather them together here. We shall denote the resolvent set, spectrum, point
spectrum, residual spectrum, and continuous spectrum of $L$ by $\rho(L)$, $\sigma(L)$,
$P\sigma(L)$, $R\sigma(L)$, and $C\sigma(L)$ respectively.


THEOREM 3.3. If $p(\mu) = \lambda$ has no real solutions, then $\lambda \in \rho(L)$ or $\lambda \in P\sigma(L)$;
and if $\lambda$ is not a branch point, then $\lambda \in P\sigma(L)$ if and only if $W_m(\lambda) = 0$. The
curve $\lambda = p(t)$ is contained in $\sigma(L)$, contains $C\sigma(L)$ and $R\sigma(L)$, and the points
of $P\sigma(L)$ and $R\sigma(L)$ lying on it form a nowhere dense set on any arc which does
not lie between two $D_m$'s in which the $W_m(\lambda)$'s are identically zero. Each portion of
$\sigma(L)$ is independent of $p$ except for the case $p = \infty$, where $\sigma(L)$ is all
$P\sigma(L)$ except perhaps for branch points lying on $\lambda = p(t)$.


Proof. If p(u) = Xd has no real solutions Theorem 2.1 shows that either 
d € p(L) or d1, o2,..., Gin, Grete Get + + os ¢, are linearly dependent. In 
this case there are constants c,, not all zero such that 


m n 


f 
M 
} 
ll 
2 


¥ 
ful jam+1 
If x is the first component of y it is clear that x € L? for any p 2 1 and 
(L — A)x = 0, s0A € Po(L). 

On an arc of X = p(t) we have W,(A) from one side and W;(A) from the other 
side. If both are identically zero then the points of the arc are all in the closure 
of Po(L), and thus in ¢(L). If W;(A) is not identically zero, then at any point 
on A = p(t) where it is not zero the only solution of Ly — Ay = f where 
f(x) = 0 for |x| > a, which is in L’, is the first component of oe K (x, &, Xd) 
F(é)dt, where F(£) is the column vector with 7"f(£) as the last entry, and all 
others zero. If (L — A)y = 0 has a solution in LZ? W,(A) must be zero as we 
shall see below so this is the only possibility. Now either ¢,,(x) — ce'” as 
x— © Or Omii(x) — ce'™ asx — — o. Thus in order to have this solution 
belong to L? (P # ~) we must have the mth or the (m + 1) st entry in 


| VW; (E) F (dé 


equal to zero. It is easy to choose F so that this is not true, so (L — A) 
cannot everywhere be defined at such a point and A € o(L). As all other 
points on A = p(t) are in the closure of the points already mentioned we see 
that A = p(t) is contained in o(L). 

In order to have Xo = p(t) belong to Po(L) for p # © we must have a 
linear relation among the solutions which are exponentially small at +2 
and those which are exponentially small at — @. This implies that W,(A) 
and W,(A) (coming from the two sides) must both be zero and thus such 
points cannot be dense on an arc of A = p(t) unless W,(A) and W,(A) are 
identically zero. The points of Re(L) cannot be dense on an arc unless those 
of Po(L*) are dense on the corresponding arc of A = p*(t), which means 













that W*,(A) and W*,(A) are identically zero. Thus the points off this arc of 
\ = p(t) are all conjugates of points in Pe(L*), and so belong to o(L). By 
the above reasoning they belong to Po(L) and W,(A) and W,(A) must be 
identically zero. So Ro(L) cannot be dense on an arc of A = p(t) which does 
not lie between two D,’s in which the W,(A)'s are identically zero. 


For $p = \infty$ we note that unless $\lambda$ is a branch point on $\lambda = p(t)$ there are $m$
solutions of $(L - \lambda)y = 0$ which are bounded at $\infty$ and at least $n - m + 1$
which are bounded at $-\infty$. As there must be a linear relation among $n + 1$
solutions, we see that $\lambda \in P\sigma(L)$ for this case, and as the branch points on
$\lambda = p(t)$ lie in the closure of these points they also lie in $\sigma(L)$.

In a consideration of $L^*$ we see that its properties can be developed from
the system adjoint to (1.1). Thus as in Theorem 3.3 we see that if $p^*(\mu) = \lambda$
has no real solutions then $\lambda \in \rho(L^*)$ or $\lambda \in P\sigma(L^*)$. If $\lambda \in \rho(L^*)$ then
$\bar\lambda \in \rho(L)$ and if $\lambda \in P\sigma(L^*)$ then $\bar\lambda \in \sigma(L)$. In the latter case, as $p^*(\mu) = \lambda$
has no real solutions neither has $p(\mu) = \bar\lambda$, so $\bar\lambda \in \sigma(L)$ implies that $\bar\lambda \in P\sigma(L)$.
Conversely we see that if $p(\mu) = \lambda$ has no real solutions and $\lambda \in P\sigma(L)$
then $\bar\lambda \in P\sigma(L^*)$. Again from the proof of Theorem 3.3 applied to $L^*$ and the
adjoint system we find that the curve $\lambda = p^*(t)$ splits up into $P\sigma(L^*)$,
$R\sigma(L^*)$, and $C\sigma(L^*)$ as described there, and is contained in $\sigma(L^*)$. This proves


COROLLARY 3.2. If $\lambda$ does not lie on $\lambda = p^*(t)$ then $\lambda \in P\sigma(L^*)$ or $\rho(L^*)$
according as $\bar\lambda \in P\sigma(L)$ or $\rho(L)$. The curve $\lambda = p^*(t)$ splits up into $P\sigma(L^*)$,
$R\sigma(L^*)$, and $C\sigma(L^*)$ as described in Theorem 3.3, and is contained in $\sigma(L^*)$.


4. The spectral resolution of L and L*. For certain special cases we
shall obtain a type of spectral resolution. We first obtain an expansion of the
Green's function in terms of a sum of eigenfunctions and an integral involving
improper eigenfunctions. From this we can develop an expansion for a suitably
restricted class of functions, prove an analogue of the Parseval equality, and
define for each bounded Borel set $M$ a closed, densely defined operator $E(M)$
which commutes with $L$ and has the properties: $E(M)E(N) = E(M \cap N)$
and $E(M) + E(N) = E(M \cup N)$ if $M \cap N = \emptyset$. In special cases these
projections may all be bounded, or even uniformly bounded, in which case $L$
is an unbounded spectral operator. We shall now proceed to develop the
expansion of the Green's function for the special cases under consideration.

We shall assume that $P\sigma(L)$ and $P\sigma(L^*)$ are finite and do not intersect
$\lambda = p(t)$ and $\lambda = p^*(t)$ respectively. We shall also assume that on $\lambda = p(t)$
and $\lambda = p^*(t)$ the functions $W_m(\lambda)$ and $W_m^*(\lambda)$ which are defined, are not zero.

This results in considerable simplification, and it is quite possible that the
method may be generalized to deal with the less special case where the only
additional assumption is that $P\sigma(L)$ and $P\sigma(L^*)$ are finite. Without this
assumption, however, the convergence difficulties seem to be quite formidable.

We shall consider the contour integral

(4.1) $$\frac{1}{2\pi i}\oint_{C_{\delta,R}} \frac{G(x, \xi, \mu)}{\mu - \lambda}\,d\mu = I(\delta, R),$$













where $C_{\delta,R}$ is a compound contour which we shall now describe. Let $H_\delta'$
consist of all points $\lambda$ which are within a distance $\delta$ of $\lambda = p(t)$. This set has
a boundary consisting of piecewise differentiable curves. Let $H_{\delta,R} = \{\lambda : |\lambda|
< R,\ \lambda \notin H_\delta'\}$, and let $C_{\delta,R}$ be the boundary of $H_{\delta,R}$, traversed in the positive
sense. We assume $\delta$ is so small and $R$ so large that $P\sigma(L) \subset H_{\delta,R}$ and so that
for each $D_m$, $D_m \cap H_{\delta,R}$ is non-empty.

The evaluation of $I(\delta, R)$ will be performed in two ways: by residues, and
by direct integration for the case $\delta \to 0$, $R \to \infty$. We shall first proceed with
the residue technique. The singularities of the integrand occur at $\mu = \lambda$,
and at $\mu = \lambda_1, \lambda_2, \dots, \lambda_p$ where $\lambda_1, \lambda_2, \dots, \lambda_p$ constitute $P\sigma(L)$. We note that
$C_{\delta,R}$ consists of loops, one in each $D_m$, and each of the singularities is enclosed
in one of these loops. Thus, by residues, $I(\delta, R)$ is simply the sum of the
residues of $G(x, \xi, \mu)/(\mu - \lambda)$ at each of these singularities.


LEMMA 4.1. The residue at $\mu = \lambda$ is $G(x, \xi, \lambda)$ and that at $\lambda_j$ is minus the
singular portion of the Laurent expansion of $G(x, \xi, \lambda)$ about $\lambda_j$.


Proof. The first statement is obvious, and the second is almost so. Since
the singularity of $G(x, \xi, \lambda)$ at $\lambda_j$ arises from a zero of the appropriate $W_m(\lambda)$,
it must be a pole. Suppose that this pole is of order $r_j$, and that the singular
part of the Laurent expansion is

$$\sum_{\alpha=1}^{r_j} G_\alpha^{(j)}(x, \xi)\,(\lambda - \lambda_j)^{-\alpha}.$$

Then in the neighbourhood of $\lambda_j$ the integrand of $I(\delta, R)$ is equal to


os =a 5 A/, ~e 
es (1 —Sa) A - ») 
1 Tj _ T;—™m — 
= Fon 2, Hm ™ Le = A) Gaal, &) 


+ terms in higher powers of (u — X,). 


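Thus the residue is obviously

$$-\sum_{\alpha=0}^{r_j-1} (\lambda - \lambda_j)^{-\alpha-1}\, G_{\alpha+1}^{(j)}(x, \xi) = -\sum_{\alpha=1}^{r_j} (\lambda - \lambda_j)^{-\alpha}\, G_\alpha^{(j)}(x, \xi).$$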
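The residue claimed in Lemma 4.1 can be checked numerically on a scalar model. The sketch below is only an illustration under stated assumptions: the pole `lam_j`, the evaluation point `lam`, and the three complex numbers in `coeffs` (stand-ins for $G_1^{(j)}, G_2^{(j)}, G_3^{(j)}$ at fixed $x$, $\xi$) are hypothetical values, not taken from the paper. It integrates the singular part divided by $\mu - \lambda$ around a small circle about the pole and compares the result with minus the singular part evaluated at $\lambda$.

```python
import cmath
import math

def contour_integral(f, center, radius, steps=4000):
    """(1/(2*pi*i)) times the integral of f over a positively oriented circle (trapezoid rule)."""
    total = 0j
    for k in range(steps):
        angle = 2 * math.pi * k / steps
        direction = cmath.exp(1j * angle)
        point = center + radius * direction
        dpoint = 1j * radius * direction * (2 * math.pi / steps)
        total += f(point) * dpoint
    return total / (2j * math.pi)

lam_j = 1.0 + 0.5j                            # hypothetical pole of G
lam = 3.0 - 1.0j                              # hypothetical point outside the small circle
coeffs = [2.0 - 1.0j, 0.5 + 0.25j, -1.5j]     # hypothetical stand-ins for G_1, G_2, G_3

def singular_part(mu):
    """Sum over alpha of G_alpha * (mu - lam_j)^(-alpha)."""
    return sum(c / (mu - lam_j) ** (alpha + 1) for alpha, c in enumerate(coeffs))

def integrand(mu):
    return singular_part(mu) / (mu - lam)

residue = contour_integral(integrand, lam_j, 0.5)
expected = -singular_part(lam)                # minus the singular portion, evaluated at lam
print(residue, expected)
```

The circle has radius 0.5, so it encloses `lam_j` but not `lam`; only the pole at `lam_j` contributes.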


In order to resolve the $G_\alpha^{(j)}(x, \xi)$'s in terms of the characteristic and
associated functions we shall need to know some of their properties.


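LEMMA 4.2. There is a positive number $u_j$ such that

(4.2) $$|G_\alpha^{(j)}(x, \xi)| < K e^{-u_j|x - \xi|}, \qquad \alpha = 1, 2, \dots, r_j.$$

For fixed $\xi$, $G_\alpha^{(j)}(x, \xi)$ belongs to the domain of $L$ and

(4.3) $$(L - \lambda_j)\,G_\alpha^{(j)}(x, \xi) = \begin{cases} 0, & \alpha = r_j, \\ G_{\alpha+1}^{(j)}(x, \xi), & \alpha = 1, 2, \dots, r_j - 1. \end{cases}$$

For fixed $x$, $\overline{G_\alpha^{(j)}(x, \xi)} = G_\alpha^{*(j)}(\xi, x)$ belongs to the domain of $L^*$ and

(4.4) $$(L^* - \bar\lambda_j)\,\overline{G_\alpha^{(j)}(x, \xi)} = \begin{cases} 0, & \alpha = r_j, \\ \overline{G_{\alpha+1}^{(j)}(x, \xi)}, & \alpha = 1, 2, \dots, r_j - 1. \end{cases}$$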

Proof. If we note that for each $\lambda_j$ there is a circle $C_j$ surrounding it and
no other characteristic values, which lies completely in $H_{\delta,R}$, we see that

$$G_\alpha^{(j)}(x, \xi) = \frac{1}{2\pi i}\oint_{C_j} (\lambda - \lambda_j)^{\alpha-1}\, G(x, \xi, \lambda)\,d\lambda.$$

We have noted that for points such as $\lambda$ on $C_j$

$$|G(x, \xi, \lambda)| < K \exp[-u|x - \xi|]$$

where $u$ is less than the minimum absolute value of the imaginary parts of
the solutions of $p(\mu) = \lambda$. As these must have a positive minimum on $C_j$
we take $u_j$ equal to one-half of that, and $|G(x, \xi, \lambda)| < K \exp[-u_j|x - \xi|]$
for all $\lambda$ on $C_j$. Thus if the radius of $C_j$ is $\rho_j$ we have

$$|G_\alpha^{(j)}(x, \xi)| \le \frac{1}{2\pi}\int_0^{2\pi} \rho_j^{\alpha-1} K \exp[-u_j|x - \xi|]\,\rho_j\,d\theta = K \rho_j^{\alpha} \exp[-u_j|x - \xi|],$$

which proves (4.2).
If To(x, &, A) is defined to be the entry in the upper right-hand corner of 
the matrix 


_ 30 E>x 
r(x, g, d) —= \(-i)" M é-® yy x <£, 
then I’) has the same discontinuity at x = & as G(x, &, A), and is analytic in 


a neighbourhood of A, which includes C,. Thus G(x, —, A) — To(x, &, A) belongs 
to C" and so must 


G.” (x, §) = sf (X — A,)*"[G(x, &, 4) — To(x, & d)]dd. 
= Cj 


From this we easily obtain (4.3), and (4.4) follows in a similar manner. 
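A finite-dimensional analogue may make the chain relation (4.3) concrete. In the sketch below a 3×3 matrix stands in for $L$; this is purely an illustrative assumption, since the operator in the text is a differential operator. The matrix has a Jordan block at the hypothetical eigenvalue `lam_j = 2`, `P` is the spectral projection onto that block, `Q = (L - lam_j) P`, and `G1 = -P`, `G2 = -Q` play the role of the Laurent coefficients of the resolvent $(L - \lambda I)^{-1}$ at the pole. The code checks that $(L - \lambda_j)G_1 = G_2$, that $(L - \lambda_j)G_2 = 0$, and that subtracting the singular part from the resolvent leaves only the contribution of the other eigenvalue.

```python
def matmul(a, b):
    """Product of two small matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def inverse(a):
    """Gauss-Jordan inverse of a small complex matrix with partial pivoting."""
    n = len(a)
    m = [[complex(v) for v in row] + [1 + 0j if i == j else 0j for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [row[n:] for row in m]

lam_j = 2.0                                   # hypothetical pole of order 2
L = [[2.0, 1.0, 0.0],
     [0.0, 2.0, 0.0],
     [0.0, 0.0, 5.0]]                         # illustrative stand-in for the operator L
P = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0]]                         # spectral projection for lam_j
shifted = [[L[i][j] - (lam_j if i == j else 0.0) for j in range(3)] for i in range(3)]
Q = matmul(shifted, P)                        # nilpotent part, Q*Q = 0
G1 = [[-v for v in row] for row in P]         # coefficient of (lam - lam_j)^(-1)
G2 = [[-v for v in row] for row in Q]         # coefficient of (lam - lam_j)^(-2)

chain_1 = matmul(shifted, G1)                 # expected to equal G2
chain_2 = matmul(shifted, G2)                 # expected to vanish

lam = 2.3 + 0.1j
resolvent = inverse([[L[i][j] - (lam if i == j else 0.0) for j in range(3)]
                     for i in range(3)])
singular = [[G1[i][j] / (lam - lam_j) + G2[i][j] / (lam - lam_j) ** 2 for j in range(3)]
            for i in range(3)]
regular = [[resolvent[i][j] - singular[i][j] for j in range(3)] for i in range(3)]
print(regular[2][2], 1 / (5.0 - lam))
```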

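In order to obtain an expression for $G_\alpha^{(j)}(x, \xi)$ we consider the $L^2$ case
separately. It is known that in the neighbourhood of an isolated pole $\lambda_j$
the resolvent $R_\lambda = (L - \lambda I)^{-1}$ can be expanded in the form

$$R_\lambda = -\frac{Q_j^{r_j-1}}{(\lambda - \lambda_j)^{r_j}} - \frac{Q_j^{r_j-2}}{(\lambda - \lambda_j)^{r_j-1}} - \dots - \frac{Q_j}{(\lambda - \lambda_j)^2} - \frac{P_j}{\lambda - \lambda_j} + \dots,$$

where $(L - \lambda_j)P_j = P_j(L - \lambda_j) = Q_j$, $P_j^2 = P_j$, $P_jQ_j = Q_jP_j = Q_j$ and
$Q_j^{r_j} = 0$. Let $P_{j0}$ be the orthogonal projection on the orthogonal complement
of $(I - P_j)\mathfrak{H}$ (where $\mathfrak{H} = L^2$). Now if $P_{j0}z = z$ and $P_jz = 0$ then
$z \in (I - P_j)\mathfrak{H}$ and so $z \in (I - P_{j0})\mathfrak{H}$, which implies that $z = 0$. Since
$(I - P_j)(I - P_{j0})z = (I - P_{j0})z$ for any $z \in \mathfrak{H}$ we have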













$$z = P_{j0}z + (I - P_{j0})z = P_j(P_{j0}z) + (I - P_j)[P_{j0}z + (I - P_{j0})z] = P_jz + (I - P_j)z$$


for any z € §. These two latter expressions both give z as the sum of a vector 
in P, plus one in (J — P,)§, and as this is unique P,(Pyoz) = Pz. Similarly 
P »(P 2) = Pyz for all z € § and so P; maps PH one to one onto P,§, 
and its inverse is Py. If {e,} is an orthonormal basis of Py then for any 


z€ 
Pz = P,P 32 = P; > (2, Zp )ex = , (Z, ex) P xx, 
k k 
from which it follows that 


Pz = rs (z, Pex )ex 


k 
and (Pex, ¢:) = dx. Thus Py» = P*,§ can be identified with the dual space 
of P,. Since every vector z € P,$ which satisfies Q,z = 0 is a characteristic 
function of LZ corresponding to the characteristic value \,, the dimension of 
the null-space of Q, restricted to P,;$ must be finite (< n/2). Using this and 
the fact that Q,") = 0, a simple induction shows that dim P,§ is less than 
or equal to (r, — 1)". Thus we may now choose a basis of P,© consisting of 


Xj, Opp, ..-, Qi xn; 
X52, Ox j2, eee Q5’°x 323 


- - 5 IP san . 
X pis OX p;,--- » Qi 1X jpjs 
where 
r,-l=spr>sp>...> S jp,- 
It is easy to see that the dual basis for P*,S may be written in the form 
* ** Sen * 
Xj, Opy,.--, 0," x43 
* ** Pass 0 
X j2, Q X52 pooces Q; X52; 
* ** * * 
- - 5IPi nn ° 
X jpjs Ox jpjy- +s Q; X jgjs 


where (Q,'x jx, Q* j'x* ym) = 5:41, ,5em- Thus 


E *s.,-8 & 
Qix = Q;P x - Q; yi 9 (x, Q; * X ja) OX ja 
a= B=0 
Pj sja—t 
* 4 * 
-¥ Fe oeenete. 
a=1 = 


Now each of these basis elements is a function, and we see that P, and 
Q,'(i = 1,2,...,7;— 1) are all integral operators, with kernels — G,” 
(x, £) and — G,4, (x, &) respectively. This yields expressions for G,‘” (x, £) 






















for which we shall make some changes in notation. Let x;,;‘” (x) be the function 
represented by Q,'#-'x, and x*,,“” (x) be the function represented by 
Q* ,**~'x* ,. Then we may summarize the preceding results in the following 
theorem. 


THEOREM 4.1. Using residue methods we obtain

(4.5) $$I_{\delta,R} = G(x, \xi, \lambda) - \sum_{j=1}^{p} \sum_{\alpha=1}^{r_j} G_\alpha^{(j)}(x, \xi)\,(\lambda - \lambda_j)^{-\alpha}.$$
For each j(j = 1,2,..., p) there are numbers q, and 
5 jqj < Sjqj-1 ~o Vs" 1; 


and functions xxi‘ (x), x*ei (x) (Rk = 1,...,¢),,1 =0,1,2,..., 5) such 
that Xe? (x) S a x* ey? (x) t Dix.» for all Pp, 


(j) r  —uyiz * r —eiiz 
xii (x)| < Ke ,lxer?(x)| < Ke . 
0 l=0 
a ) (yj) = J 
(L Ns) xe xi? J. | Soe = 
* +) _ JO i=Q@ 
(L i j) Xx ' xn it =1.2.. iin 
P (j) *(;’) 
Xei (X) Xe i (x )dx = 8 5 iB xxb v4 sik 
Also 
Pi Sja~? — 
G(x, e) = — DD xb (x) x07 -v-0(€), 
a=! B=0 


where a sum from zero to a negative integer is zero. 


Proof. Relation (4.5) follows from Lemma 4.1. The estimates of the 
x’s and x*’s follow from (4.2) and their linear independence, while the 
other formulae follow from the construction of the x‘”’s and x*‘”’s, and from 
well-known properties of characteristic and associated functions. 

We must now proceed to the evaluation of $I_{\delta,R}$ directly. From Theorem 4.1
$I_{\delta,R}$ is independent of $\delta$ and $R$, and we wish the limit of the direct evaluation
in terms of line integrals as $\delta \to 0$ and $R \to \infty$. We shall denote the portion
of $I_{\delta,R}$ which arises from the integral over arcs of the circle $|\mu| = R$ by

$$\int_{|\mu|=R} \frac{G(x, \xi, \mu)}{\mu - \lambda}\,d\mu.$$


LEMMA 4.3.

$$\lim_{R \to \infty} \int_{|\mu|=R} \frac{G(x, \xi, \mu)}{\mu - \lambda}\,d\mu = 0$$

uniformly in $\delta$ for $\delta > 0$, provided $x \ne \xi$.













»= a exp] 24] (1 + O(a") 


for large |A| we see that the function u(A) in |G(x, &,A)| < K exp [— u(d) 
|x — &|] approaches zero as Re“ approaches Ree = (to) in the same manner 
as 


Proof. As 


R's = ° 
Thus we may divide one of these arcs of |\{ = R (from 6; to 62 say) by 6,’ 
and 6,’ such that 
IK exp] —cR"™ snl 2 = %| lx —£ Jo, <6< 6; 
|G(x, &, Re)| < | K exp[—cR™"|x — |] << 0< 45 


K exp] —cR’ . sin 2 — 8] x—€t |e: <6 < Os. 


As there are two of these arcs we have 


. | 
fe G(x, & 4) 9, | 
ier MA ) 








9k —SESE aE es __ 
< 2K ‘ R-a R d@ 
or ‘exp[—cR™” — |) 
+ 2K f R- I R dé 
6s’ exp| —cR™ sin as x — &| 
oK ee. Se 
+ 2K * Roh R dé 
<K exp[—cR™" sin @ |x — t|] d@ (R > 2|A\) 
ox /2 2 l/n 
<K{ oxy - =K—¢. «- tla 
Kr 


< OR — §)’ 


where K has been increased as necessary. 
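The last step of an estimate of this kind typically rests on a Jordan-type bound: since $\sin\theta \ge 2\theta/\pi$ on $[0, \pi/2]$, one has $\int_0^{\pi/2} e^{-a\sin\theta}\,d\theta < \pi/(2a)$ for $a > 0$. That this is exactly the inequality used in the proof is an assumption on our part; with $a = cR^{1/n}|x - \xi|$ it gives a bound of order $1/(R^{1/n}|x - \xi|)$, which tends to zero as $R \to \infty$ when $x \ne \xi$. A minimal numerical sketch of the inequality:

```python
import math

def integral_exp_sin(a, steps=20000):
    """Midpoint-rule approximation of the integral of exp(-a*sin(theta)) over [0, pi/2]."""
    h = (math.pi / 2) / steps
    return h * sum(math.exp(-a * math.sin((k + 0.5) * h)) for k in range(steps))

# Compare the integral with the Jordan-type bound pi/(2a) for several values of a.
results = []
for a in (1.0, 10.0, 100.0, 1000.0):
    value = integral_exp_sin(a)
    bound = math.pi / (2 * a)
    results.append((a, value, bound))
    print(a, value, bound)
```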


Thus in the limit we need only consider the integrals along the boundary 


of H;.z as R— © and 6 —0. The arcs have the form 


bp" (t) 


d= pit et) 
















and so may be written in the limit as $\delta \to 0$

$$\int_{\mu=p(t)} \frac{G(x, \xi, \mu+) - G(x, \xi, \mu-)}{\mu - \lambda}\,d\mu,$$


where G(x, &, 4 +) is the limit of the Green’s function from one side, and 
G(x, &, 4 —) is the limit from the other side. Our assumption that W (A) has 
no zeros in \ = p(t) implies that these limits exist and are continuous, so 
the limit of the integral exists as long as R is finite. We must now evaluate 
G(x, —&, u +) — G(x, £,4 —) to show that it exists when R= @ as well. 
First note that they are uniformly bounded in x, — and yw, and have the same 
discontinuity in the (m — 1)st derivative at x = ¢ This implies that as 
functions of x they belong to C" and satisfy (L — u)y = 0. It is easy to see 
that this equation has precisely as many linearly independent uniformly 
bounded solutions as p(t) = yw has real solutions (unless yw is an eigenvalue, 
which we have excluded). Let this number be o(u), where » = p(t), and 
denote the linearly independent bounded solutions by x:(x, u), x2(x, u),.. 
Xe(u) (x, w). Then it is obvious that 


** 


cn plete dee 
G(x, &,u+) — G(x, t,u—) = 2x > x (x, m) x5 (E, @). 


j=l 


We see also that (L* — yu)x*,;(x, 4) = 0 and that x*,(x,u) is bounded. In 
fact, if we investigate the asymptotic behaviour of G(x, —, u +) — G(x, &,u —) 
for large values of u we find that it behaves like 


> ; eti® §) 
j=l p (v;) 


where o(u) is 1 or 2 and »; (and v2) are the real solutions of p(v) = uw. Thus 
the integral over the portion of \ = p(t) which extends to infinity will be 


less than or equal to 
. dt 
Ps 
¥ ty p(t) —X 


which clearly exists for > 2. 


THEOREM 4.2. 


dD rj 
(4.6) Gix,tra= >> Do GPx, Ha-aA)™ 


j=l a=1 a 
o(m) * - 
x(x, w) x,(&, Z) 
+f Sree 4, 
pep(t) j=l aces A 
Proof. This follows immediately from the evaluation of I;,z as 6-0 
and R— @ above, and from the previous evaluation by residues in (4.5). 
We may immediately obtain a corresponding expansion of G*(x, &, A), 
and also the following expansion of a certain class of functions. 


THEOREM 4.3. Jf f € Dia, g © Du. then 
















Ae) -{ f(x) x*(x, X)dx and 2; (A) -f« g(x) x(x, d)dx 


exist and are integrable along \ = p(t) and X = p*(t) respectively. Also 


Dp sj 
(4.7) f(x) = Db > > xan (x) (f, ee 
j=l a=1 B=0 
@(A) 
+f SS xs ahora, 
h=p(t) j=l 
D @ 5ja 
g(x) = ey 2 4 D> xas” (x) (g, ea 
. @(A) 
+ DS x5(x, 085 


h=p*(1) j=l 


There is also an analogue of the Parseval equality: 


(4.8) (f,g) = > > x ( semaine 


@(A) a 
+ DX frye (aa 
h=p(t) j=l 


and one may obtain a slightly different expression by interchanging f and g, 
taking conjugates, and changing the variable in the integral to } = p*(t) 


Proof. The existence of FFA) and g*,(A) is obvious, and the expansions 
(4.7) follow from the identities 


f=(WZ-)d) fe, E, A)f(E)dE, g 


Il 


(L* — r) | G* (x, &, A)g(E)dE 


exactly as in (4). Relation (4.8) follows when we note that f € D,.; implies 
f € L* as well. 
Formally we define the spectral resolution E(M) in terms of the kernel 


E(x, ;M) = — 2, Gi", gf > x(x, w) x5 &, B) dy. 
€ Pi) j=1 


It is easy to see that this exists for any bounded Borel set M, and that for any 
f€ Dr, E(M/f = f__E(x, &; M)f(&)dé exists. Thus for any bounded Borel 
set E(M) is a closed, densely defined operator. Also, for any such 
f, E(M)f € Dz», and if E(M)Lf exists it is equal to LE(M)f, so L and E(M) 
commute in a certain sense. It is also clear thatif M (\ N = ¢,then E(M\W N) 
= E(M) + E(N). It is not quite so clear that E(M)E(N) = E(M C/N), 
but if we note that for \ = p(t), xm(x, A) € Dz..., we see that x,(x, A) = 
(A - Ao) f__ G(x, £, Xo) xm(é, A)dé. It is easy to see that f_.G¢ (x, £)Xm 
(€, \)\d=é = 0 so that 


xm(s, 0) = (A — do) f fo > sew xi EB)» & r)dudt, 
—o Yu=p(t) j 2d, Be — Xo 










and for g € Dy. 


ape (ff SA=“a0 .enseo 
BO = ff DAS BW ale IE waa 


j=l Bm 


; inne i. o(u) = Re -_o 
= lim ar, B® | Xm(E, A) x7 (E, A)dEdp. 

Ry. Rese Yp=—p(t) B Xo j=l —R; 

Thus we see that the distribution fo? xm (E, d)x*,(&, @)dé is zero unless 7 = m 
in which case it is 6(u — A). Thus the kernel f_..°E(x, 2; M)E(z, &, N)dz 
of E(M)E(N) is easily shown to be equal to E(x, §; M (\N), and we have 
E(M)E(N) = E(M (VN). 

In some cases this resolution will consist of bounded projections, and may 
even be uniformly bounded. In this case it will be a spectral measure in the 
sense of Dunford (3) if it is also countably additive. However, this will not 
be true in general, for if the perturbing operator L, is absent E(M) is a simple 
function of the operator with kernel 4fuqrx explio(x — £)]de. This gives a 
self-adjoint spectral measure on $L^2$, but Rota (7) has pointed out that it is
unbounded on any other $L^p$ space.


REFERENCES 


1. M. S. Brodskii and M. S. Livšic, Spectral analysis of non-self-adjoint operators and intermediate systems, Usp. Mat. Nauk (N.S.), 13, no. 1 (79) (1958), 3-85.
2. E. A. Coddington and N. Levinson, Theory of ordinary differential equations (New York, 1955).
3. N. Dunford, Spectral operators, Pac. J. Math., 4 (1954), 321-354.
4. R. R. D. Kemp, A singular boundary value problem for a non-self-adjoint differential operator, Can. J. Math., 10 (1958), 447-462.
5. M. S. Livšic, On spectral decomposition of linear non-self-adjoint operators, Amer. Math. Soc. Translations (2), 5 (1957), 67-114.
6. M. A. Naimark, Investigation of the spectrum and expansion in eigenfunctions of a non-self-adjoint differential operator of the second order on a semi-axis, Trudy Moskov. Mat. Obšč., 3 (1954), 181-270.
7. G. C. Rota, Extension theory of differential operators I, Comm. Pure and App. Math., 11 (1958), 23-66.
8. J. Schwartz, Perturbations of spectral operators, and applications I, bounded perturbations, Pac. J. Math., 4 (1954), 415-458.


Queen's University 











THE GIBBS PHENOMENON FOR TAYLOR MEANS
AND FOR $[F, d_n]$ MEANS


CHESTER L. MIRACLE 


1. Introduction. The Gibbs phenomenon may be described, quite
generally, as follows. Let a sequence $\{f_n(x)\}$ $(n = 0, 1, 2, \dots)$ converge to
a function $f(x)$ for $x$ in the interval $x_0 < x < x_0 + h$. We say that $\{f_n(x)\}$
displays the Gibbs phenomenon in a right-hand neighbourhood of the point
$x_0$ if

$$\limsup_{\substack{n \to \infty \\ x \to x_0 + 0}} f_n(x) > f(x_0 + 0), \quad \text{or} \quad \liminf_{\substack{n \to \infty \\ x \to x_0 + 0}} f_n(x) < f(x_0 + 0).$$

A similar definition holds for a left-hand neighbourhood. If $\{f_n(x)\}$ displays
the Gibbs phenomenon at both sides of $x_0$, we say simply that $\{f_n(x)\}$ displays
the Gibbs phenomenon at the point $x_0$. We define the Gibbs set of the sequence
$\{f_n(x)\}$ at the point $x_0$ to be the set of all numbers to which $f_n(x)$ can tend as
$n \to \infty$ and $x \to x_0$ through appropriate values. Here we will be concerned not
with the Gibbs phenomenon in general, but with the Gibbs phenomenon as
displayed by the sequence of partial sums of a Fourier series. Further, we will
restrict ourselves to Fourier series representing functions which satisfy the
Dirichlet conditions.

The following is a description of the Gibbs phenomenon for the case we
will consider. Suppose the function $f(x)$ satisfies the Dirichlet conditions in
the interval $-\pi \le x \le \pi$, and suppose $a$ is a discontinuity of the function
$f(x)$. Let $\{s_n(x)\}$ denote the sequence of partial sums of the Fourier series for
$f(x)$; then by proper choice of a sequence $\{t_n\}$, which approaches $a$ as $n$
approaches infinity, we can make the sequence $\{s_n(t_n)\}$ approach any number
in the closed interval whose endpoints are

(1.1) $$f(a - 0) + \frac{f(a + 0) - f(a - 0)}{\pi}\, I \quad \text{and} \quad f(a + 0) - \frac{f(a + 0) - f(a - 0)}{\pi}\, I,$$

where

$$I = \int_{\pi}^{\infty} \frac{\sin y}{y}\,dy = -.28\ldots$$


Received October 26, 1959. This paper is a condensation of a portion of the author's 
dissertation written at the University of Kentucky under the direction of Professor V. F. 
Cowling. 




A proof that this conclusion follows from the above hypothesis can be found
in Bôcher (3) or Carslaw (4). Note that for our case the Gibbs set at the
point $a$ is composed of all points having abscissa $a$ and ordinates in the closed
interval whose endpoints are given by (1.1).
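The constant $I = \int_\pi^\infty (\sin y / y)\,dy$ and the endpoints (1.1) are easy to evaluate numerically. The sketch below is an illustration only: it relies on the classical Dirichlet integral $\int_0^\infty (\sin y / y)\,dy = \pi/2$, so that $I = \pi/2 - \mathrm{Si}(\pi)$, and the helper names `sine_integral` and `gibbs_endpoints`, as well as the sample one-sided limits $f(a-0) = -1$, $f(a+0) = 1$, are our own choices.

```python
import math

def sine_integral(t, steps=2000):
    """Composite Simpson approximation of Si(t), the integral of sin(y)/y over [0, t]."""
    def sinc(y):
        return 1.0 if y == 0.0 else math.sin(y) / y
    h = t / steps                      # steps must be even for Simpson's rule
    total = sinc(0.0) + sinc(t)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * sinc(k * h)
    return total * h / 3

# Tail of the Dirichlet integral from pi to infinity: pi/2 - Si(pi).
I = math.pi / 2 - sine_integral(math.pi)

def gibbs_endpoints(left, right):
    """Endpoints (1.1) for the one-sided limits left = f(a-0) and right = f(a+0)."""
    jump = right - left
    return left + jump / math.pi * I, right - jump / math.pi * I

low, high = gibbs_endpoints(-1.0, 1.0)
print(I, low, high)
```

For a jump from $-1$ to $1$ the interval extends beyond $[-1, 1]$ on both sides, which is the familiar overshoot of roughly nine per cent of the jump.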

Let $A = (a_{nk})$ and $\{s_n\}$ $(n, k = 0, 1, 2, 3, \dots)$ be a matrix and a sequence
of complex numbers, respectively. Let the members of the sequence $\{\sigma_n\}$ be
defined by

$$\sigma_n = \sum_{k=0}^{\infty} a_{nk} s_k;$$

then we say $\{\sigma_n\}$ is the $A$ transform of $\{s_n\}$. The matrix $A = (a_{nk})$ is called
regular if

$$\lim_{n \to \infty} s_n = \lim_{n \to \infty} \sigma_n,$$

whenever the first limit exists. Necessary and sufficient conditions in order
that a matrix $A = (a_{nk})$ be regular are the well known Silverman-Toeplitz
conditions:

(1.2) $$\sum_{k=0}^{\infty} |a_{nk}| < K \qquad (n = 0, 1, 2, \dots),$$

(1.3) $$\lim_{n \to \infty} \sum_{k=0}^{\infty} a_{nk} = 1,$$

(1.4) $$\lim_{n \to \infty} a_{nk} = 0 \qquad (k = 0, 1, 2, \dots),$$

where $K$ is a constant independent of $n$.
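As a concrete illustration of conditions (1.2) to (1.4), the sketch below uses the Cesàro $(C, 1)$ matrix, $a_{nk} = 1/(n+1)$ for $k \le n$ and $0$ otherwise. This matrix is our choice of example and is not one of the matrices studied in the paper. The code evaluates the three quantities for a few rows and applies the transform to a sequence converging to 1.

```python
def cesaro_row(n, width):
    """Row n of the (C,1) matrix truncated to `width` columns: a_nk = 1/(n+1) for k <= n."""
    return [1.0 / (n + 1) if k <= n else 0.0 for k in range(width)]

def transform(row, s):
    """sigma_n = sum over k of a_nk * s_k for one row of the matrix."""
    return sum(a * x for a, x in zip(row, s))

width = 2001
s = [1.0 + 1.0 / (k + 1) for k in range(width)]          # a sequence converging to 1
rows = {n: cesaro_row(n, width) for n in (10, 100, 2000)}

abs_row_sums = {n: sum(abs(a) for a in row) for n, row in rows.items()}   # condition (1.2)
row_sums = {n: sum(row) for n, row in rows.items()}                        # condition (1.3)
first_column = {n: row[0] for n, row in rows.items()}                      # condition (1.4), k = 0
sigma = {n: transform(row, s) for n, row in rows.items()}

for n in rows:
    print(n, abs_row_sums[n], row_sums[n], first_column[n], sigma[n])
```

The row sums stay bounded and equal to 1, the entries in a fixed column shrink like $1/(n+1)$, and the transformed values approach the limit of the original sequence.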

If $f(x)$ is a function satisfying the Dirichlet conditions and $\{s_n(x)\}$ is the
sequence of partial sums of the Fourier series for $f(x)$, then it is well-known
that for a given $x$ the sequence $\{s_n(x)\}$ approaches

$$\frac{f(x + 0) + f(x - 0)}{2}$$

as $n$ approaches infinity. If we transform the sequence $\{s_n(x)\}$ into the sequence
$\{R_n(x)\}$ by a regular sequence to sequence matrix $A$, then it follows that for
given $x$ the sequence $\{R_n(x)\}$ also approaches

$$\frac{f(x + 0) + f(x - 0)}{2}$$

as $n$ approaches infinity. A question then presents itself. Does the sequence
$\{R_n(x)\}$ also display the Gibbs phenomenon at every finite discontinuity of
$f(x)$? This question has been studied for Cesàro means by Cramér (7) and
Gronwall (8), for Euler means by Szász (13), for Borel means by Lorch (12),
for Hausdorff means by Szász (14), and for Riesz means by Kuttner (10),
to cite only a few cases. Our purpose is to study this question for the Taylor
matrix and the $[F, d_n]$ matrix.













DEFINITION (1.1). Let $f(x)$ denote any function satisfying the Dirichlet
conditions and having a discontinuity at the point $a$. Let $\{S_n(x)\}$ denote the sequence
of partial sums of the Fourier series representing $f(x)$. Let $\{R_n(x)\}$ denote the $A$
transform of $\{S_n(x)\}$. If for every such function $f(x)$ the sequence $\{R_n(x)\}$ displays
the Gibbs phenomenon at the point $a$ and has the same Gibbs set at the point $a$ as
does $\{S_n(x)\}$, we say that the $A$ transform completely preserves the Gibbs phenomenon
for Fourier series.


In some cases the sequence {R,(x)} displays the Gibbs phenomenon at 
x = a, but does not have the same Gibbs set at a as does the sequence {S,,(x)}. 
We use the word completely here to indicate that we are excluding such cases 
from our consideration. In such cases one might simply say that the A trans- 
form preserves the Gibbs phenomenon for Fourier series. 

A short calculation shows that the first number in (1.1) can be written as

\[ \frac{f(a+0) + f(a-0)}{2} + \frac{f(a+0) - f(a-0)}{\pi} \int_0^{-\pi} \frac{\sin y}{y}\, dy, \]

and the second number as

\[ \frac{f(a+0) + f(a-0)}{2} + \frac{f(a+0) - f(a-0)}{\pi} \int_0^{\pi} \frac{\sin y}{y}\, dy. \]

Since

\[ \int_0^{\tau} \frac{\sin y}{y}\, dy \]

is a continuous function of $\tau$, it follows that any number in the interval whose
end points are given by (1.1) can be written in the form

\[ \frac{f(a+0) + f(a-0)}{2} + \frac{f(a+0) - f(a-0)}{\pi} \int_0^{\tau} \frac{\sin y}{y}\, dy \tag{1.2} \]

by proper choice of $\tau$ in the interval $-\pi \le \tau \le \pi$. Hence we have the following
theorem.


THEOREM (1.1). Let $f(x)$, $A$, $a$, $\{S_n(x)\}$, and $\{R_n(x)\}$ denote the same quantities
as in Definition (1.1). If for each $\tau$ in $-\pi \le \tau \le \pi$, there is a sequence
$\{t_n\}$, with $\lim_{n\to\infty} t_n = a$ and so that

\[ \lim_{n\to\infty} R_n(t_n) = \frac{f(a+0) + f(a-0)}{2} + \frac{f(a+0) - f(a-0)}{\pi} \int_0^{\tau} \frac{\sin y}{y}\, dy, \]

then the $A$ transform completely preserves the Gibbs phenomenon for Fourier
series.
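To make the size of the Gibbs set concrete, here is a small sketch that evaluates the integral $\int_0^{\tau} (\sin y)/y\, dy$ from its Maclaurin series and then the expression in Theorem (1.1) for a jump from $-1$ to $+1$. The function names `sine_integral` and `gibbs_value` and the choice of one-sided limits are assumptions for illustration only.

```python
import math

def sine_integral(tau, terms=40):
    """Integral of sin(y)/y from 0 to tau, summed from its Maclaurin series."""
    total, power, fact = 0.0, tau, 1.0   # power = tau^(2m+1), fact = (2m+1)!
    for m in range(terms):
        total += (-1) ** m * power / ((2 * m + 1) * fact)
        power *= tau * tau
        fact *= (2 * m + 2) * (2 * m + 3)
    return total

def gibbs_value(left, right, tau):
    """Limit value in Theorem (1.1) for one-sided limits f(a-0)=left, f(a+0)=right."""
    return (right + left) / 2 + (right - left) / math.pi * sine_integral(tau)

lower = gibbs_value(-1.0, 1.0, -math.pi)   # tau = -pi gives one end of the Gibbs set
upper = gibbs_value(-1.0, 1.0, math.pi)    # tau = +pi gives the other end
print(lower, upper)
```

For this jump the two endpoints are symmetric about zero and lie outside $[-1, 1]$, which is the familiar overshoot of roughly nine percent of the jump.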


2. Taylor and $[F, d_n]$ transforms. The elements $a_{nk}$ of the Taylor
matrix $T_r$ are defined by the relation

\[ \frac{(1-r)^{n+1}\,\theta^n}{(1 - r\theta)^{n+1}} = \sum_{k=0}^{\infty} a_{nk}\,\theta^k, \qquad |r\theta| < 1. \tag{2.1} \]




























THE GIBBS PHENOMENON FOR MEANS 663 


It is shown by Cowling (5) that the Taylor matrix satisfies the Silverman-Toeplitz
conditions (1.2), (1.3), and (1.4), and thus is regular if and only if
$0 < r < 1$. A short history of the Taylor matrix, and a list of the basic papers
concerning it, are to be found in a paper by Cowling and Piranian (6).

The elements $P_{nk}$ of the $[F, d_n]$ matrix are defined by the relation

\[ P_{00} = 1, \qquad \prod_{j=1}^{n} \frac{\theta + d_j}{1 + d_j} = \sum_{k=0}^{n} P_{nk}\,\theta^k \quad (n \ge 1), \qquad d_j > 0 \ (j = 1, 2, 3, \ldots). \tag{2.2} \]

This matrix was studied by Jakimovski (9), who shows that it is regular if
and only if

\[ \sum_{n=1}^{\infty} d_n^{-1} = +\infty. \]

The $[F, d_n]$ matrix is a generalization of the Euler and Lototsky matrices. If
we let $d_n = n - 1$ in the $[F, d_n]$ matrix, we get a matrix whose elements $a_{nk}$
are given by

\[ \theta(\theta + 1)(\theta + 2) \cdots (\theta + n - 1) = \sum_{k=0}^{n} n!\, a_{nk}\,\theta^k. \]

This is the matrix of Lototsky (11). It is shown by Lototsky and Agnew (2)
that this matrix is regular. If we let $d_n = (1 - r)/r$ in the $[F, d_n]$ matrix, we
get a matrix whose elements $a_{nk}$ are given by

\[ [r\theta + (1 - r)]^n = \sum_{k=0}^{n} a_{nk}\,\theta^k. \]

This is the well-known Euler matrix. It is shown by Agnew (1) that this matrix
is regular for $0 < r < 1$.
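The generating relation (2.2) can be turned directly into a computation: multiplying out the linear factors $(\theta + d_j)/(1 + d_j)$ gives the row $P_{n0}, \ldots, P_{nn}$. The sketch below does this and compares the two special cases named above. The helper name `fd_row` and the parameter values $n = 6$, $r = 0.3$ are assumptions chosen for the example.

```python
import math

def fd_row(d):
    """Coefficients P_n0..P_nn of prod_j (theta + d_j)/(1 + d_j), with d = [d_1..d_n]."""
    coeffs = [1.0]                           # empty product: P_00 = 1
    for dj in d:
        nxt = [0.0] * (len(coeffs) + 1)
        for k, c in enumerate(coeffs):
            nxt[k] += c * dj / (1 + dj)      # constant part of the new factor
            nxt[k + 1] += c / (1 + dj)       # theta part of the new factor
        coeffs = nxt
    return coeffs

n, r = 6, 0.3

# d_j = (1 - r)/r: every factor becomes r*theta + (1 - r), the Euler case.
euler = fd_row([(1 - r) / r] * n)
binomial = [math.comb(n, k) * r**k * (1 - r) ** (n - k) for k in range(n + 1)]

# d_j = j - 1: the product is theta(theta+1)...(theta+n-1)/n!, the Lototsky case.
lototsky = fd_row([j - 1 for j in range(1, n + 1)])

print(euler)
print(lototsky)
```

In the Euler case the row agrees with the binomial weights $\binom{n}{k} r^k (1-r)^{n-k}$; in the Lototsky case the row sums to 1, its constant term vanishes, and its leading coefficient is $1/n!$.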


3. Preliminary theorem. We first state the following lemma.


LEMMA (3.1). Suppose $\{S_n(x)\}$ is a sequence which approaches the function
$f(x)$ uniformly in the interval $a < x < b$. Further, suppose there exist constants
$M$ and $M_n$ $(n = 0, 1, 2, \ldots)$ such that $|f(x)| < M$ and $|S_n(x)| < M_n$ for all $x$
in the interval $a < x < b$. If $\{R_n(x)\}$ denotes the transform of the sequence
$\{S_n(x)\}$ by any regular sequence to sequence transform $A$, then the sequence
$\{R_n(x)\}$ approaches $f(x)$ uniformly for $a < x < b$.


From the manner in which this lemma is stated, the reader can easily
construct the proof using the Silverman-Toeplitz conditions (1.2), (1.3), and
(1.4).
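One way the omitted proof might run is sketched below; the splitting index $k_0$ and the arrangement of the estimate are assumptions of this sketch, not taken from the paper. Given $\epsilon > 0$, choose $k_0$ so that $|S_k(x) - f(x)| < \epsilon$ for all $k > k_0$ and all $x$ in the interval.

```latex
\begin{align*}
R_n(x) - f(x)
  &= \sum_{k \le k_0} a_{nk}\bigl(S_k(x) - f(x)\bigr)
   + \sum_{k > k_0} a_{nk}\bigl(S_k(x) - f(x)\bigr)
   + \Bigl(\sum_{k=0}^{\infty} a_{nk} - 1\Bigr) f(x), \\
|R_n(x) - f(x)|
  &\le \sum_{k \le k_0} |a_{nk}|\,(M_k + M)
   + K\epsilon
   + M\,\Bigl|\sum_{k=0}^{\infty} a_{nk} - 1\Bigr| .
\end{align*}
```

The first term tends to zero as $n \to \infty$ by (1.4), since it is a finite sum; the second is controlled by (1.2); the third tends to zero by (1.3). None of the three bounds depends on $x$, which gives the uniformity.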


THEOREM (3.1). Let $A = (a_{nk})$ denote a regular sequence to sequence matrix.
Define the function $\phi(x)$ by

\[ \phi(x) = \begin{cases} -\pi/2, & -\pi < x < 0, \\ \phantom{-}\pi/2, & 0 < x < \pi, \end{cases} \tag{3.1} \]

$\phi(-\pi) = \phi(0) = \phi(\pi) = 0$, and $\phi(x) = \phi(x + 2\pi)$.
Let $\{s_n(x)\}$ denote the sequence of partial sums of the Fourier series for $\phi(x)$. Let
$\{\sigma_n(x)\}$ denote the $A$ transform of $\{s_n(x)\}$. Now, if $\{\sigma_n(x)\}$ displays the Gibbs
phenomenon at zero and has the same Gibbs set as $\{s_n(x)\}$ at zero, then the $A$
transform completely preserves the Gibbs phenomenon for Fourier series.


Proof. Let $f(x)$ be any function with period $2\pi$ satisfying the Dirichlet
conditions in the interval $-\pi \le x \le \pi$ and having a discontinuity at the point
$a$. Since $f(a)$ does not affect the Fourier series for $f(x)$, let us define $f(a)$ by
$f(a) = \tfrac{1}{2}\{f(a+0) + f(a-0)\}$. Since $f(x)$ satisfies Dirichlet's conditions, it
can be represented by a Fourier series whose sequence of partial sums will
be denoted by $\{S_n(x)\}$.

Define the function $\psi(x)$ by

\[ \psi(x) = f(x) - \frac{f(a+0) + f(a-0)}{2} - \frac{f(a+0) - f(a-0)}{\pi}\, \phi(x - a). \tag{3.2} \]

Since $\psi(x)$ satisfies Dirichlet's conditions, it can be represented by a Fourier
series, whose sequence of partial sums will be denoted by $\{\psi_n(x)\}$.
A short computation shows that $\phi(x - a)$ has the Fourier series expansion

\[ \sum_{v=1}^{\infty} \bigl[1 - (-1)^v\bigr] \Bigl\{ \frac{\cos va}{v} \sin vx - \frac{\sin va}{v} \cos vx \Bigr\}. \]

Let $\{s_n(x - a)\}$ denote the sequence of partial sums of this Fourier series.
If we replace $f(x)$ by its value from (3.2), compute the coefficients of its
Fourier series expansion, and then sum from 1 to $n$, we get

\[ S_n(x) = \psi_n(x) + \frac{f(a+0) + f(a-0)}{2} + \frac{f(a+0) - f(a-0)}{\pi}\, s_n(x - a). \tag{3.3} \]

Let $\{R_n(x)\}$, $\{\Psi_n(x)\}$, and $\{\sigma_n(x - a)\}$ denote, respectively, the $A$ transforms
of the sequences $\{S_n(x)\}$, $\{\psi_n(x)\}$, and $\{s_n(x - a)\}$. Then applying the $A$
transform to the sequences on both sides of (3.3), we obtain

\[ R_n(x) = \Psi_n(x) + \frac{f(a+0) + f(a-0)}{2} \sum_{k=0}^{\infty} a_{nk} + \frac{f(a+0) - f(a-0)}{\pi}\, \sigma_n(x - a). \tag{3.4} \]

From (3.2), $\psi(a+0) = \psi(a-0) = \psi(a) = 0$, and so $\psi(x)$ is continuous
at $x = a$. Since $\psi(x)$ satisfies the Dirichlet conditions, there exist numbers $\alpha$
and $\beta$ $(\alpha < a < \beta)$ such that $\psi(x)$ is continuous for $\alpha \le x \le \beta$. From Lemma
(3.1) it follows that $\{\Psi_n(x)\}$ approaches $\psi(x)$ uniformly for $\alpha \le x \le \beta$. Hence
given $\epsilon > 0$, there exist an integer $n_0$ and a number $\delta$ such that if $n > n_0$ and
$|x - a| < \delta$, then

\[ |\Psi_n(x)| < \epsilon. \tag{3.5} \]


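By assumption $\{\sigma_n(x - a)\}$ displays the Gibbs phenomenon at $x = a$ and
has the same Gibbs set at $x = a$ as does $\{s_n(x)\}$. Going back to (1.1), this
means that by proper choice of a sequence $\{t_n\}$, such that $\lim_{n\to\infty} t_n = a$, we
can make the sequence $\{\sigma_n(t_n - a)\}$ approach any number in an interval whose
endpoints are

\[ \phi(0-) + \frac{\phi(0+) - \phi(0-)}{\pi} \int_{\pi}^{\infty} \frac{\sin y}{y}\, dy = \int_0^{-\pi} \frac{\sin y}{y}\, dy \]

and

\[ \phi(0+) - \frac{\phi(0+) - \phi(0-)}{\pi} \int_{\pi}^{\infty} \frac{\sin y}{y}\, dy = \int_0^{\pi} \frac{\sin y}{y}\, dy. \]

Let $\tau$ $(-\pi \le \tau \le \pi)$ and $\epsilon > 0$ be given. It now follows that there exists an
integer $N_1$ and a sequence $\{t_n\}$, with $\lim_{n\to\infty} t_n = a$, such that if $n > N_1$, then

\[ \Bigl| \sigma_n(t_n - a) - \int_0^{\tau} \frac{\sin y}{y}\, dy \Bigr| < \frac{\pi\epsilon}{3\,|f(a+0) - f(a-0)|}. \]

From (3.5) there exists an integer $N_2$ such that if $n > N_2$, then

\[ |\Psi_n(t_n)| < \tfrac{1}{3}\epsilon. \]

From (1.3) there exists an integer $N_3$ such that if $n > N_3$, then

\[ \Bigl| \sum_{k=0}^{\infty} a_{nk} - 1 \Bigr| < \frac{2\epsilon}{3\,|f(a+0) + f(a-0)|}. \]

Rearranging (3.4), inserting absolute values, and replacing $x$ by $t_n$ yields

\[ \Bigl| R_n(t_n) - \Bigl[ \frac{f(a+0) + f(a-0)}{2} + \frac{f(a+0) - f(a-0)}{\pi} \int_0^{\tau} \frac{\sin y}{y}\, dy \Bigr] \Bigr| \]
\[ \le |\Psi_n(t_n)| + \Bigl| \frac{f(a+0) + f(a-0)}{2} \Bigr| \, \Bigl| \sum_{k=0}^{\infty} a_{nk} - 1 \Bigr| + \frac{|f(a+0) - f(a-0)|}{\pi} \, \Bigl| \sigma_n(t_n - a) - \int_0^{\tau} \frac{\sin y}{y}\, dy \Bigr|. \]

Let $N = \max(N_1, N_2, N_3)$, then for $n > N$

\[ \Bigl| R_n(t_n) - \Bigl[ \frac{f(a+0) + f(a-0)}{2} + \frac{f(a+0) - f(a-0)}{\pi} \int_0^{\tau} \frac{\sin y}{y}\, dy \Bigr] \Bigr| < \epsilon. \]

The theorem now follows from an application of Theorem (1.1).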


4. The two main theorems.


THEOREM (4.1). The Taylor transform completely preserves the Gibbs
phenomenon for Fourier series.


Proof. A short computation shows that the function $\phi(x)$ given by (3.1)
has the Fourier series expansion

\[ 2 \sum_{v=1}^{\infty} \frac{\sin (2v - 1)x}{2v - 1}. \]

Let $\{s_n(x)\}$ denote the sequence of partial sums of this series, then another
short computation shows that

\[ s_n(x) = \int_0^x \frac{\sin 2nt}{\sin t}\, dt. \]


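In the following discussion we consider only values of $x$ in the interval
$0 \le x \le \tfrac{1}{2}\pi$. The Taylor transform $\{\sigma_n(x)\}$ of the sequence $\{s_n(x)\}$ is given by

\[ \sigma_n(x) = \sum_{k=n}^{\infty} (1 - r)^{n+1} \binom{k}{n} r^{k-n} \int_0^x \frac{\sin 2kt}{\sin t}\, dt. \tag{4.1} \]

Since $0 \le t \le \tfrac{1}{2}\pi$, we have

\[ \Bigl| \frac{\sin 2kt}{\sin t} \Bigr| \le 2k. \]

Hence,

\[ \sum_{k=n}^{\infty} (1 - r)^{n+1} \binom{k}{n} r^{k-n} \Bigl| \frac{\sin 2kt}{\sin t} \Bigr| \le \sum_{k=n}^{\infty} (1 - r)^{n+1} \binom{k}{n} r^{k-n}\, 2k = \frac{2(n + r)}{1 - r}, \]

so the series in (4.1) is uniformly convergent for $0 \le t \le \pi/2$. Therefore, we
may interchange the order of integration and summation in (4.1), which gives

\[ \sigma_n(x) = \int_0^x \frac{(1 - r)^{n+1}}{\sin t} \sum_{k=n}^{\infty} \binom{k}{n} r^{k-n} \sin 2kt \, dt = \int_0^x \frac{(1 - r)^{n+1}}{r^n \sin t}\, \operatorname{Im} \Bigl\{ \sum_{k=n}^{\infty} \binom{k}{n} r^{k} e^{2kit} \Bigr\}\, dt. \]

Using (2.1) with $\theta = e^{2it}$ to sum the series, we get

\[ \sigma_n(x) = \int_0^x \frac{(1 - r)^{n+1}}{\sin t}\, \operatorname{Im} \Bigl\{ \frac{e^{2nit}}{(1 - r e^{2it})^{n+1}} \Bigr\}\, dt. \tag{4.2} \]

Define $\rho$ and $\theta$ by the relation

\[ \rho\, e^{-i\theta} = 1 - r e^{2it}. \tag{4.3} \]

From (4.3) it follows that

(4.4a) $\rho \cos \theta = 1 - r \cos 2t$,
(4.4b) $\rho \sin \theta = r \sin 2t$,
(4.4c) $\rho^2 = 1 - 2r \cos 2t + r^2$,
(4.4d) $0 \le \theta < \pi/2$, and
(4.4e) $(1 - r) \le \rho$.


Substituting the left-hand member of (4.3) for the right-hand member of (4.3)
in (4.2), we obtain

\[ \sigma_n(x) = \int_0^x \frac{1}{\sin t} \Bigl( \frac{1 - r}{\rho} \Bigr)^{n+1} \sin[(n + 1)\theta + 2nt]\, dt. \tag{4.5} \]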














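Using (4.4c), it follows that

\[ 1 - \Bigl( \frac{1 - r}{\rho} \Bigr)^2 = \frac{4r}{\rho^2} \sin^2 t. \]

Since $r < 1$ and $\rho > 0$, (4.4c) implies $0 < (1 - r)/\rho \le 1$. It follows
that

\[ 0 \le 1 - \frac{1 - r}{\rho} \le 1 - \Bigl( \frac{1 - r}{\rho} \Bigr)^2 \le \frac{4rt^2}{\rho^2}. \tag{4.6} \]

Therefore, in view of (4.4e), we are able to write

\[ \Bigl( \frac{1 - r}{\rho} \Bigr)^{n+1} = 1 - \mu\, \frac{(n + 1)\, r t^2}{(1 - r)^2}, \tag{4.7} \]

where $0 \le \mu \le 4$. Note that $\mu$ is a function of $n$, $r$, and $t$. Substituting the
value of $[(1 - r)/\rho]^{n+1}$ from (4.7) into (4.5), we get

\[ \sigma_n(x) = \int_0^x \frac{\sin[(n + 1)\theta + 2nt]}{\sin t}\, dt - \frac{r(n + 1)}{(1 - r)^2} \int_0^x \frac{\mu t^2 \sin[(n + 1)\theta + 2nt]}{\sin t}\, dt = I + I'. \tag{4.8} \]

Making use of the well-known inequality $0 \le x - \sin x \le x^3$, valid for
$x \ge 0$, we obtain the inequality

\[ |\rho\theta - 2rt| \le \rho(\theta - \sin \theta) + r(2t - \sin 2t) \le \rho\theta^3 + 8rt^3. \]

It follows from (4.6) that $(\rho + r - 1) \le (4rt^2/\rho)$. Hence

\[ |\theta(1 - r) - 2rt| \le |\rho\theta - 2rt| + (r + \rho - 1)\theta \le \rho\theta^3 + 8rt^3 + (4rt^2\theta/\rho). \]

From (4.4b) and (4.4d), it follows that

\[ \theta \le (\pi r/2\rho) \sin 2t \le (\pi r t/\rho). \]

Therefore,

\[ |\theta(1 - r) - 2rt| \le \frac{\pi^3 r^3 t^3}{\rho^2} + 8rt^3 + \frac{4\pi r^2 t^3}{\rho^2}. \]

Using (4.4e), this inequality becomes

\[ |\theta(1 - r) - 2rt| \le \frac{\pi^3 r^3 t^3}{(1 - r)^2} + 8rt^3 + \frac{4\pi r^2 t^3}{(1 - r)^2}. \]

Hence we have that

\[ \theta = \frac{2rt}{1 - r} + \lambda t^3, \tag{4.9} \]

where

\[ |\lambda| \le \frac{\pi^3 r^3}{(1 - r)^3} + \frac{8r}{1 - r} + \frac{4\pi r^2}{(1 - r)^3} < \frac{54}{(1 - r)^3}. \]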


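Note that $\lambda$ is a function of $r$ and $t$. Upon replacing $\theta$ by the right-hand member
of (4.9), the integral $I$ in (4.8) becomes

\[ I = \int_0^x \frac{\sin L_n t \, \cos[(n + 1)\lambda t^3]}{\sin t}\, dt + \int_0^x \frac{\cos L_n t \, \sin[(n + 1)\lambda t^3]}{\sin t}\, dt, \]

where

\[ L_n = 2n + \frac{2r(n + 1)}{1 - r}. \]

Putting this value of $I$ into (4.8) and adding

\[ - \int_0^x t^{-1} \sin L_n t \, dt \]

to both sides of (4.8), we get

\[ \sigma_n(x) - \int_0^x \frac{\sin L_n t}{t}\, dt = - \frac{r(n + 1)}{(1 - r)^2} \int_0^x \frac{\mu t^2 \sin[(n + 1)\theta + 2nt]}{\sin t}\, dt \]
\[ \qquad + \int_0^x \frac{\cos L_n t \, \sin[(n + 1)\lambda t^3]}{\sin t}\, dt - \int_0^x \sin L_n t \; \frac{\sin t - t \cos[(n + 1)\lambda t^3]}{t \sin t}\, dt \]
\[ = T_1 + T_2 + T_3. \tag{4.10} \]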
Since $\sin t \ge (2t/\pi)$ for $0 \le t \le \tfrac{1}{2}\pi$, we have that

\[ |T_1| \le \frac{r(n + 1)}{(1 - r)^2} \int_0^x \frac{4\pi t^2}{2t}\, dt = \frac{\pi r(n + 1) x^2}{(1 - r)^2} \]

and

\[ |T_2| \le (n + 1) \int_0^x \frac{\pi |\lambda| t^3}{2t}\, dt \le \frac{9\pi(n + 1) x^3}{(1 - r)^3}. \]

After expanding $\sin t$ and $t \cos[(n + 1)\lambda t^3]$ in series, it follows from a well-known
theorem for convergent alternating series that

\[ |\sin t - t \cos[(n + 1)\lambda t^3]| \le t^3 + (n + 1)^2 \lambda^2 t^7. \]

Applying this inequality, we get

\[ |T_3| \le \int_0^x \frac{\pi[t^3 + (n + 1)^2 \lambda^2 t^7]}{2t^2}\, dt \le x^2 + \frac{243\pi(n + 1)^2 x^6}{(1 - r)^6}. \]

Inserting absolute values on both sides of (4.10) and using the above inequalities
for $|T_1|$, $|T_2|$, and $|T_3|$, we have

\[ \Bigl| \sigma_n(x) - \int_0^x \frac{\sin L_n t}{t}\, dt \Bigr| \le \frac{r\pi(n + 1) x^2}{(1 - r)^2} + \frac{9\pi(n + 1) x^3}{(1 - r)^3} + x^2 + \frac{243\pi(n + 1)^2 x^6}{(1 - r)^6}. \tag{4.11} \]

Let $\tau$ such that $0 \le \tau \le \pi$ be given, and define the sequence $\{t_n\}$ by
$t_n = \tau/L_n$, then for $n \ge 1$ we have $t_n \le \pi/2$. Therefore, if $n \ge 1$, we may
replace $x$ by $t_n$ in (4.11). This gives


\[ \Bigl| \sigma_n(t_n) - \int_0^{t_n} \frac{\sin L_n t}{t}\, dt \Bigr| \le \frac{r\pi(n + 1)\tau^2}{(1 - r)^2 L_n^2} + \frac{9\pi(n + 1)\tau^3}{(1 - r)^3 L_n^3} + \frac{\tau^2}{L_n^2} + \frac{243\pi(n + 1)^2 \tau^6}{(1 - r)^6 L_n^6}, \]

where $r$ $(0 < r < 1)$ is fixed. Upon setting $y = L_n t$, it follows that given
$\epsilon > 0$ there exists an integer $N$ $(N \ge 1)$ such that if $n > N$, then

\[ \Bigl| \sigma_n(t_n) - \int_0^{\tau} \frac{\sin y}{y}\, dy \Bigr| < \epsilon. \]

Let $\epsilon > 0$ and $\tau$ such that $-\pi \le \tau < 0$ be given. Since $-\tau$ is in the interval
$0 < -\tau \le \pi$, we have just shown the existence of a sequence $\{-t_n\}$, with
$-t_n = -\tau/L_n$, and an integer $N$ such that if $n > N$, then

\[ \Bigl| \sigma_n(-t_n) - \int_0^{-\tau} \frac{\sin y}{y}\, dy \Bigr| < \epsilon. \]

It follows from (4.1) that $\sigma_n(-x) = -\sigma_n(x)$. Substituting $-\sigma_n(t_n)$ for $\sigma_n(-t_n)$
and $-y$ for $y$, this inequality becomes

\[ \Bigl| \sigma_n(t_n) - \int_0^{\tau} \frac{\sin y}{y}\, dy \Bigr| < \epsilon. \]

Since

\[ \frac{\phi(0+) + \phi(0-)}{2} = 0 \quad \text{and} \quad \frac{\phi(0+) - \phi(0-)}{\pi} = 1, \]

it now follows from (1.2) that $\{\sigma_n(x)\}$ displays the Gibbs phenomenon at
$x = 0$. Note we have also shown that $\{\sigma_n(x)\}$ has the same Gibbs set at zero
as does the sequence $\{s_n(x)\}$. Theorem (4.1) now follows from an application
of Theorem (3.1).
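The limit established in this proof can be checked numerically. The sketch below evaluates a truncated Taylor transform of the partial sums $s_k(x) = 2\sum_{v \le k} \sin((2v-1)x)/(2v-1)$ at $t_n = \tau/L_n$ with $\tau = \pi$ and compares it with $\int_0^{\pi} (\sin y)/y\, dy$. The function name `taylor_transform`, the truncation length `extra`, the parameter values $n = 200$, $r = 0.5$, and the closed form $L_n = 2(n + r)/(1 - r)$ (obtained here by substituting (4.9) into $(n+1)\theta + 2nt$) are all assumptions of this sketch.

```python
import math

SI_PI = 1.8519370520  # integral of sin(y)/y over [0, pi], to ten decimals

def taylor_transform(n, r, x, extra=2000):
    """Truncated sigma_n(x) = sum_{k>=n} (1-r)^(n+1) C(k,n) r^(k-n) s_k(x)."""
    weight = (1.0 - r) ** (n + 1)            # a_nn
    s_k = 0.0
    for v in range(1, n + 1):                # build s_n(x) term by term
        s_k += 2.0 * math.sin((2 * v - 1) * x) / (2 * v - 1)
    sigma = 0.0
    for k in range(n, n + extra + 1):
        sigma += weight * s_k
        weight *= r * (k + 1) / (k + 1 - n)  # a_{n,k+1} obtained from a_{n,k}
        s_k += 2.0 * math.sin((2 * k + 1) * x) / (2 * k + 1)   # s_{k+1}(x)
    return sigma

n, r, tau = 200, 0.5, math.pi
L_n = 2.0 * (n + r) / (1.0 - r)              # assumed closed form, see lead-in
t_n = tau / L_n
sigma = taylor_transform(n, r, t_n)
print(sigma, SI_PI)
```

With these values the transform already exceeds $\pi/2$, the value of $\phi$ to the right of the jump, and sits close to the limiting overshoot; the remaining gap shrinks as $n$ grows.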














THEOREM (4.2). The $[F, d_n]$ transform completely preserves the Gibbs
phenomenon for Fourier series.


Proof. Let $\{s_n(x)\}$ denote the sequence of partial sums of the Fourier
series representing the function $\phi(x)$ as given by (3.1). As in Theorem (4.1),

\[ s_n(x) = \int_0^x \frac{\sin 2nt}{\sin t}\, dt. \]

In the following discussion we consider only values of $x$ in the interval
$0 \le x \le \pi/4$. Let $\{\sigma_n(x)\}$ denote the $[F, d_n]$ transform of $\{s_n(x)\}$, then

\[ \sigma_n(x) = \sum_{k=0}^{n} P_{nk} \int_0^x \frac{\sin 2kt}{\sin t}\, dt = \int_0^x \frac{1}{\sin t} \sum_{k=0}^{n} P_{nk} \sin 2kt \, dt, \tag{4.12} \]

where the numbers $P_{nk}$ are defined by (2.2). Using (2.2), we can write the
last sum in (4.12) as

\[ \sum_{k=0}^{n} P_{nk} \operatorname{Im} e^{2kit} = \operatorname{Im} \sum_{k=0}^{n} P_{nk}\, e^{2kit} = \operatorname{Im} \prod_{j=1}^{n} \Bigl( \frac{e^{2it} + d_j}{1 + d_j} \Bigr). \]

Replacing the sum in the last member of (4.12) by this product, we have

\[ \sigma_n(x) = \int_0^x \frac{1}{\sin t}\, \operatorname{Im} \prod_{j=1}^{n} \Bigl( \frac{e^{2it} + d_j}{1 + d_j} \Bigr)\, dt. \tag{4.13} \]

Let us define $\rho_j$ and $\theta_j$ $(j = 1, 2, 3, \ldots)$ by

\[ \rho_j\, e^{i\theta_j} = e^{2it} + d_j. \tag{4.14} \]

From (4.14) it follows that

(4.15a) $\rho_j \cos \theta_j = \cos 2t + d_j$,
(4.15b) $\rho_j \sin \theta_j = \sin 2t$,
(4.15c) $\rho_j^2 = 1 + 2d_j \cos 2t + d_j^2$,
(4.15d) $\rho_j \le 1 + d_j$,
(4.15e) $0 \le \theta_j \le 2t \le \pi/2$.


Substituting the left-hand side of (4.14) for the right-hand side of (4.14) in
(4.13), we get

\[ \sigma_n(x) = \int_0^x \frac{1}{\sin t} \prod_{j=1}^{n} \Bigl( \frac{\rho_j}{1 + d_j} \Bigr) \sin \Bigl( \sum_{j=1}^{n} \theta_j \Bigr)\, dt. \tag{4.16} \]

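From (4.15c) and (4.15d), it follows that

\[ 0 \le 1 - \frac{\rho_j}{1 + d_j} \le 1 - \Bigl( \frac{\rho_j}{1 + d_j} \Bigr)^2 = \frac{4 d_j \sin^2 t}{(1 + d_j)^2} \le \frac{4 d_j t^2}{(1 + d_j)^2} \le \frac{4t^2}{1 + d_j}, \]

or that

\[ 1 - \frac{\rho_j}{1 + d_j} \le \frac{4t^2}{1 + d_j}. \tag{4.17} \]

Hence in view of (4.15d), we have

\[ 0 \le 1 - \prod_{j=1}^{n} \frac{\rho_j}{1 + d_j} = \Bigl( 1 - \frac{\rho_1}{1 + d_1} \Bigr) + \frac{\rho_1}{1 + d_1} \Bigl( 1 - \frac{\rho_2}{1 + d_2} \Bigr) + \cdots + \frac{\rho_1}{1 + d_1} \cdots \frac{\rho_{n-1}}{1 + d_{n-1}} \Bigl( 1 - \frac{\rho_n}{1 + d_n} \Bigr) \]
\[ \le \sum_{j=1}^{n} \Bigl( 1 - \frac{\rho_j}{1 + d_j} \Bigr) \le 4t^2 \sum_{j=1}^{n} \frac{1}{1 + d_j}. \]

Therefore, we are able to write

\[ 1 - \prod_{j=1}^{n} \frac{\rho_j}{1 + d_j} = \lambda H_n t^2, \tag{4.18} \]

where $0 \le \lambda \le 4$ and $H_n = \sum_{j=1}^{n} (1 + d_j)^{-1}$. Note that $\lambda$ is a function of $n$
and $t$. Substituting in (4.16) the value of

\[ \prod_{j=1}^{n} \frac{\rho_j}{1 + d_j} \]

as given in (4.18), we get

\[ \sigma_n(x) = \int_0^x \csc t \, \sin \Bigl( \sum_{j=1}^{n} \theta_j \Bigr)\, dt - H_n \int_0^x \lambda t^2 \csc t \, \sin \Bigl( \sum_{j=1}^{n} \theta_j \Bigr)\, dt = I + I'. \tag{4.19} \]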

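Since $\theta_j \le \pi/2$, the inequality $(2\theta_j/\pi) \le \sin \theta_j$ holds. This plus the well-known
inequality $\sin x \le x$, valid for $x \ge 0$, applied to (4.15b) yields the
inequality

\[ \theta_j \le (\pi t/\rho_j) \quad (j = 1, 2, 3, \ldots). \]

Making use of the well-known inequality $0 \le x - \sin x \le x^3$, valid for
$x \ge 0$, and the inequality (4.17), we have

\[ \Bigl| \theta_j - \frac{2t}{1 + d_j} \Bigr| \le \frac{\rho_j}{1 + d_j} (\theta_j - \sin \theta_j) + \frac{1}{1 + d_j} (2t - \sin 2t) + \Bigl( 1 - \frac{\rho_j}{1 + d_j} \Bigr) \theta_j \]
\[ \le \frac{\rho_j \theta_j^3}{1 + d_j} + \frac{8t^3}{1 + d_j} + \frac{4t^2 \theta_j}{1 + d_j}. \]

Since we have just shown $\theta_j \le (\pi t/\rho_j)$, it follows that

\[ \Bigl| \theta_j - \frac{2t}{1 + d_j} \Bigr| \le \frac{\pi^3 t^3}{1 + d_j} + \frac{8t^3}{1 + d_j} + \frac{4\pi t^3}{1 + d_j}, \]

for from (4.15c) $\rho_j \ge 1$. Consequently, we may write

\[ \theta_j = \frac{2t}{1 + d_j} + \frac{\mu_j t^3}{1 + d_j} \quad (j = 1, 2, 3, \ldots), \tag{4.20} \]

where $|\mu_j| \le \pi^3 + 8 + 4\pi < 54$. Note that $\mu_j$ is a function of $d_j$ and $t$.
Summing both sides of (4.20) over $j$, we get

\[ \sum_{j=1}^{n} \theta_j = 2tH_n + (\mu_n H_n) t^3, \quad \text{where} \quad (\mu_n H_n) = \sum_{j=1}^{n} \frac{\mu_j}{1 + d_j} \quad \text{and} \quad |(\mu_n H_n)| < 54 H_n. \]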


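Substituting for $\sum_{j=1}^{n} \theta_j$ in the integral $I$ in (4.19), and adding

\[ - \int_0^x t^{-1} \sin 2H_n t \, dt \]

to both sides of (4.19), we have

\[ \sigma_n(x) - \int_0^x \frac{\sin 2H_n t}{t}\, dt = - H_n \int_0^x \lambda t^2 \csc t \, \sin \Bigl( \sum_{j=1}^{n} \theta_j \Bigr)\, dt \]
\[ \qquad + \int_0^x \csc t \, \sin[(\mu_n H_n) t^3] \cos 2H_n t \, dt - \int_0^x \sin 2H_n t \; \frac{\sin t - t \cos[(\mu_n H_n) t^3]}{t \sin t}\, dt = T_1 + T_2 + T_3. \tag{4.21} \]

Since $t \le \pi/4$, $\csc t \le \pi/2t$. Hence

\[ |T_1| \le (\pi/2) H_n \int_0^x 4t \, dt = \pi H_n x^2, \]

and

\[ |T_2| \le (\pi/2) \int_0^x |(\mu_n H_n)|\, t^2 \, dt \le 9\pi H_n x^3. \]

After expanding $\sin t$ and $t \cos[(\mu_n H_n) t^3]$ in series, it follows from a well-known
theorem for convergent alternating series that

\[ |\sin t - t \cos[(\mu_n H_n) t^3]| \le t^3 + (\mu_n H_n)^2 t^7. \]

Applying this inequality and the inequality $(2t/\pi) \le \sin t \le t$, valid for
$0 \le t \le \pi/4$, we get

\[ |T_3| \le (\pi/2) \int_0^x [t + (\mu_n H_n)^2 t^5]\, dt \le x^2 + 243\pi H_n^2 x^6. \]

Now inserting absolute values on both sides of (4.21) and making use of the
above inequalities for $|T_1|$, $|T_2|$, and $|T_3|$, we have that

\[ \Bigl| \sigma_n(x) - \int_0^x \frac{\sin 2H_n t}{t}\, dt \Bigr| \le H_n \pi x^2 + 9\pi H_n x^3 + x^2 + 243\pi H_n^2 x^6. \tag{4.22} \]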







Let $\tau$ such that $0 \le \tau \le \pi$ be given. Since $\sum_{n=1}^{\infty} d_n^{-1} = +\infty$, it follows
that $H_n \to \infty$ as $n \to \infty$. Define the sequence $\{t_n\}$ by $t_n = \tau/2H_n$, then for $n$
greater than or equal to some fixed integer $n_0$ the numbers $t_n$ are in the interval
$0 \le t_n \le \pi/4$. Replacing $x$ by $t_n$ in (4.22), we have for $n > n_0$ that

\[ \Bigl| \sigma_n(t_n) - \int_0^{t_n} \frac{\sin 2H_n t}{t}\, dt \Bigr| \le \frac{\pi\tau^2}{4H_n} + \frac{9\pi\tau^3}{8H_n^2} + \frac{\tau^2}{4H_n^2} + \frac{243\pi\tau^6}{64H_n^4}. \]

Upon setting $y = 2H_n t$, it follows that given $\epsilon > 0$ there exists an integer
$N$ $(N > n_0)$ such that if $n > N$, then

\[ \Bigl| \sigma_n(t_n) - \int_0^{\tau} \frac{\sin y}{y}\, dy \Bigr| < \epsilon. \]

The remainder of the proof of this theorem follows almost exactly the last
two paragraphs in the proof of Theorem (4.1).
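The same kind of numerical check works for the $[F, d_n]$ transform. The sketch below builds one row of the matrix from the product in (2.2), applies it to the partial sums $s_k(x) = 2\sum_{v \le k} \sin((2v-1)x)/(2v-1)$, and evaluates the result at $t_n = \tau/2H_n$ with $\tau = \pi$. The helper names `fd_row` and `fd_transform`, and the choice $d_j = 1$ for all $j$ (the Euler case with $r = 1/2$) with $n = 400$, are assumptions made for the illustration.

```python
import math

SI_PI = 1.8519370520  # integral of sin(y)/y over [0, pi], to ten decimals

def fd_row(d):
    """Coefficients P_n0..P_nn of prod_j (theta + d_j)/(1 + d_j)."""
    coeffs = [1.0]
    for dj in d:
        nxt = [0.0] * (len(coeffs) + 1)
        for k, c in enumerate(coeffs):
            nxt[k] += c * dj / (1 + dj)
            nxt[k + 1] += c / (1 + dj)
        coeffs = nxt
    return coeffs

def fd_transform(d, x):
    """sigma_n(x) = sum_{k=0}^{n} P_nk s_k(x) for the [F, d_n] row given by d."""
    sigma, s_k = 0.0, 0.0                    # s_0(x) = 0
    for k, p in enumerate(fd_row(d)):
        if k > 0:
            s_k += 2.0 * math.sin((2 * k - 1) * x) / (2 * k - 1)
        sigma += p * s_k
    return sigma

n = 400
d = [1.0] * n                                # assumed choice: d_j = 1 for every j
H_n = sum(1.0 / (1.0 + dj) for dj in d)      # H_n as defined after (4.18)
t_n = math.pi / (2.0 * H_n)                  # t_n = tau / (2 H_n) with tau = pi
sigma = fd_transform(d, t_n)
print(H_n, sigma, SI_PI)
```

Here $H_n = 200$, so $t_n$ lies well inside $0 \le t \le \pi/4$, and the transform is close to the limiting value of the integral, in line with the estimate following (4.22).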


REFERENCES


1. R. P. Agnew, Euler transformations, Amer. J. Math., 66 (1944), 318-338.

2. R. P. Agnew, The Lototsky method for evaluation of series, Michigan Math. J., 4 (1957), 105-128.

3. M. Bôcher, Introduction to the theory of Fourier's series, Ann. Math., 7 (1907), 81-152.

4. H. S. Carslaw, Theory of Fourier's series and integrals (London, 1930).

5. V. F. Cowling, Summability and analytic continuation, Proc. Amer. Math. Soc., 1 (1950),
536-542.

6. V. F. Cowling and G. Piranian, On the summability of ordinary Dirichlet series by Taylor
methods, Mich. Math. J., 1 (1952), 72-78.

7. H. Cramer, Etudes sur la sommation des séries de Fourier, Arkiv för Matematik, Astronomi
och Fysik, 13, no. 20 (1919), 1-21.

8. T. H. Gronwall, Zur Gibbschen Erscheinung, Ann. Math. (2), 31 (1930), 233-240.

9. A. Jakimovski, A generalization of the Lototsky method of summability, Mich. Math. J.,
6 (1959), 277-290.

10. B. Kuttner, On the Gibbs phenomenon for Riesz means, J. London Math. Soc., 19 (1944),
153-161.

11. Lototsky, On a linear transformation of sequences and series, Ivanov. Gos. Ped. Inst. Uc.
Zap. Fiz-Mat. Nauki, 4 (1953), 61-91 (in Russian).

12. Lee Lorch, The Gibbs phenomenon for Borel means, Proc. Amer. Math. Soc., 8 (1957),
81-84.

13. Otto Szasz, On the Gibbs phenomenon for Euler means, Acta Scientiarum Mathematicarum,
12, Part b (1950), 107-111.

14. Otto Szasz, Gibbs phenomenon for Hausdorff means, Trans. Amer. Math. Soc., 69 (1950),
440-456.


University of Minnesota











CONTINUOUS TRANSLATION OF HÖLDER AND
LIPSCHITZ FUNCTIONS


H. MIRKIL 


All functions will be complex, periodic, integrable (on $[0, 2\pi]$) functions of a
real variable $x$. Moreover, we shall require that every function have mean zero
on $[0, 2\pi]$, so that in particular non-zero constants are excluded.


1. Plessner's characterization of absolutely continuous functions.
An old theorem of Plessner (4), generalized to arbitrary compact groups by
Bochner (1), can be taken as our starting point. Consider the functions $f$ of
bounded variation on $[0, 2\pi]$. These $f$ form a Banach space $F$ when each $f$ is
normed by its total variation on $[0, 2\pi]$. And translations define a natural
one-parameter group of isometries on $F$. But there exist $f \in F$ such that the
translate $T_x f$, defined by $(T_x f)(y) = f(x + y)$, does not vary continuously
with $x$. In fact Plessner proves that the mapping $x \to T_x f$ is continuous
precisely when $f$ is absolutely continuous. These $f$ constitute a closed invariant
subspace $\mathcal{C}F$ of $F$, and the $F$ norm can be reinterpreted on $\mathcal{C}F$ as

\[ \int_0^{2\pi} |f'(x)|\, dx. \]


The above result suggests a whole class of concrete "Plessner problems":
one for each (non-continuous) representation $x \to T_x$ of a locally compact
group $X$ on a Banach space $F$. Namely, to identify the $f \in F$ for which the
mapping $x \to T_x f$ is continuous. Or equivalently, to find the largest invariant
subspace $\mathcal{C}F \subset F$ on which the representation $x \to T_x$ is continuous.

Here we shall study only a small subclass of Plessner problems, in which $X$
is the reals mod $2\pi$, and $F$ is some translation-invariant Banach space of
(complex, periodic, integrable) functions. Thus $F$ will always be a subspace
of $L_1[0, 2\pi]$. We call such an $F$ a translating space if its norm obeys, for each
$f \in F$,

\[ \Bigl| \int_0^{2\pi} f(x)\, e^{ikx}\, dx \Bigr| \le \text{const}\, \|f\|. \]

(By 3.5 below, the $F$ norm must then be larger than the $L_1$ norm.) The Plessner
problem is now to find the set $\mathcal{C}F$ of all continuously translating functions in
$F$, an $f \in F$ being said to translate continuously if the mapping $x \to T_x f$ is
continuous from $X$ into the Banach space $F$. It is enough to know that this
mapping is continuous at a single $x_0$, or that the numerical function
$x \to \|T_x f - f\|$ is continuous at 0, or that for each $\phi$ in the dual space $F^*$
the numerical function $x \to \phi(T_x f)$ is continuous.

Received August 10, 1959.

Research supported by U.S. Air Force Office of Scientific Research under Contract
49(638)-294.

(The above choice of $L_1$ as "universal space" is more from convenience
than necessity. For some purposes it might be useful to take the space $M$
of measures, or even some larger space of Schwartz distributions. But $L_1$
is well suited to our present aims.)

With many classical function spaces $F$, the identification of $\mathcal{C}F$ is easy.
For instance, let $L_p$ be the $p$th-power-integrable functions, let $L_\infty$ be the bounded
functions, let $C$ be the continuous (periodic) functions, and let $M$ be the
Radon measures on $[0, 2\pi]$. Then for $1 \le p < \infty$ we have $\mathcal{C}L_p = L_p$; we
have $\mathcal{C}L_\infty = C$; and we have $\mathcal{C}M = L_1$. This last fact is simply the
differentiated version of Plessner's original theorem.


2. Definition of Lipschitz spaces. The Hardy-Littlewood theorems
for $L_p$. In this paper we propose to solve the Plessner problem associated
with various spaces of Hölder functions. We aim at an abstract Banach space
formulation just general enough to include the Hardy-Littlewood results
(3) on integration of $L_p$ spaces. Since the Lipschitz functions (Hölder functions
of order $\alpha = 1$) require special treatment, we shall take up this case separately
first.

Given a translating space $F$, with norm $\|f\|$, let us define for each $f \in F$
the associated Lipschitz seminorm

\[ \|f\|^{\Lambda} = \sup_{x > 0} \frac{\|T_x f - f\|}{x}. \]

(Since $f$ is periodic, and $1/x$ is decreasing, the sup over $x$: $0 < x < \infty$ equals
the sup over $x$: $0 < x \le 2\pi$. We define "seminorm" in such a way that it
may take the value $\infty$, but never the value 0 unless $f = 0$.) The $f$ for which
$\|f\|^{\Lambda} < \infty$ constitute a subspace of $F$, which we shall call $\Lambda F$, and $\|f\|^{\Lambda}$
defines a genuine norm on $\Lambda F$. Hence we may say that $\Lambda F$ is defined and
normed by $\|f\|^{\Lambda} < \infty$.


Example 2.1. Let $F$ be the space $C$ of (periodic) continuous functions,
with norm $\|f\|_\infty = \sup |f(x)|$. Then $\Lambda C$ is the space of ordinary (periodic)
Lipschitz functions, and the norm $\|f\|_\infty^{\Lambda}$ of $f \in \Lambda C$ is its smallest Lipschitz
constant,

\[ \|f\|_\infty^{\Lambda} = \sup_{x \ne y} \frac{|f(x) - f(y)|}{|x - y|}. \]

Notice that although $C \subset L_\infty$, nonetheless for the corresponding Lipschitz
spaces we have $\Lambda C = \Lambda L_\infty$. Every ordinary Lipschitz function $f$ (is absolutely
continuous and) has a bounded derivative $f'$, and conversely the indefinite
integral of a bounded function is Lipschitz. (We say the indefinite integral
because the constant of integration is uniquely determined by our standing
requirement that all functions have mean zero.) Furthermore, an easy computation
shows that $\|f\|_\infty^{\Lambda} = \sup_x |f'(x)|$. Hence $\|T_x f - f\|_\infty^{\Lambda} = \|T_x f' - f'\|_\infty$,
and the latter goes to 0 with $x$ if and only if $f'$ is (uniformly) continuous.
Thus we have
Thus we have 


THEOREM. In order that an ordinary Lipschitz function f translate con- 
tinuously in the Lipschitz norm it is necessary and sufficient that f have a 
continuous derivative. 
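The contrast in this theorem can be seen numerically on a grid. The sketch below estimates the Lipschitz seminorm of $T_h f - f$ in the sup norm for a smooth function and for a triangle wave whose derivative jumps. The grid size, the helper names `lipschitz_seminorm`, `translate_difference`, and `triangle`, and the two test functions are assumptions chosen for the example; the grid value is only a lower estimate of the true supremum.

```python
import math

N = 600                                       # grid points on [0, 2*pi)
GRID = [2 * math.pi * j / N for j in range(N)]

def lipschitz_seminorm(f):
    """Grid estimate of sup_{x>0} ||T_x f - f||_inf / x over shifts x = 2*pi*m/N."""
    values = [f(y) for y in GRID]
    best = 0.0
    for m in range(1, N):
        x = 2 * math.pi * m / N
        diff = max(abs(values[(j + m) % N] - values[j]) for j in range(N))
        best = max(best, diff / x)
    return best

def translate_difference(f, h):
    """The function T_h f - f, i.e. y -> f(y + h) - f(y)."""
    return lambda y: f(y + h) - f(y)

def triangle(y):
    """Mean-zero, 2*pi-periodic triangle wave; its derivative jumps between -1 and +1."""
    return abs((y % (2 * math.pi)) - math.pi) - math.pi / 2

h = 2 * math.pi * 5 / N                       # a small shift lying on the grid
smooth_norm = lipschitz_seminorm(math.sin)
smooth_shift = lipschitz_seminorm(translate_difference(math.sin, h))
kinked_shift = lipschitz_seminorm(translate_difference(triangle, h))
print(smooth_norm, smooth_shift, kinked_shift)
```

For the sine the seminorm of $T_h f - f$ is small and shrinks with $h$, while for the triangle wave it stays near 2 however small the shift, reflecting the jump in its derivative.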


Example 2.2. For $1 < p < \infty$, $\Lambda L_p$ is the space defined and normed
by

\[ \|f\|_p^{\Lambda} = \sup_{x > 0} \frac{\|T_x f - f\|_p}{x} = \sup_{x > 0} \frac{1}{x} \Bigl( \int_0^{2\pi} |f(x + y) - f(y)|^p \, dy \Bigr)^{1/p}. \]

THEOREM. $\Lambda L_p$ consists precisely of all indefinite integrals $\int f$ of the $f \in L_p$.
And $\|f\|_p^{\Lambda} = \|f'\|_p$. In particular, every $f \in \Lambda L_p$ translates continuously in
$\Lambda L_p$.

Proof. Hardy and Littlewood (3, Theorem 22, p. 596, Theorem 24, p. 599).


Example 2.3. The space $\Lambda L_1$ is defined and normed by

\[ \|f\|_1^{\Lambda} = \sup_{x > 0} \frac{1}{x} \int_0^{2\pi} |f(x + y) - f(y)| \, dy. \]

THEOREM. $\Lambda L_1$ is precisely the functions of bounded variation. And
$\|f\|_1^{\Lambda} = \text{tot. var.}(f)$. In order that $f \in \Lambda L_1$ translate continuously in $\Lambda L_1$ it
is necessary and sufficient that $f$ be absolutely continuous.


Proof. Hardy and Littlewood (3).

Thus the Hardy-Littlewood theorems together describe the "indefinite
integral spaces" $\int L_p$ of all the $L_p$ spaces, $1 \le p \le \infty$. We have no idea how
to describe $\int F$ when $F$ is an arbitrary translating space. But when the $f \in F$
all translate continuously in $F$, then we are able to prove (Theorem 4.11
below) that $\int F$ is always the continuously translating part of $\Lambda F$.


3. Fundamental properties of translating spaces. It will be useful
at this point to assemble some general facts about continuous translation.
None of the results below are new. Most of them appear in (2) and (5).

Throughout this section, $F$ will be a translating space, with norm $\|f\|$.


LEMMA 3.1. Each translation $T_x$ is a bounded operator $F \to F$.


Proof. We use the closed graph theorem. Suppose $f_n \to 0$ in $F$ and $T_x f_n \to g$.
We must prove $g = 0$. Because $F$ is a translating space, then

\[ \int_0^{2\pi} g(y)\, e^{iky}\, dy = \lim_n \int_0^{2\pi} (T_x f_n)(y)\, e^{iky}\, dy = \lim_n \int_0^{2\pi} f_n(y)\, e^{ik(y - x)}\, dy = e^{-ikx} \lim_n \int_0^{2\pi} f_n(y)\, e^{iky}\, dy = 0. \]

Hence $g$ has all Fourier coefficients zero, and the lemma is proved.











Notice that we have not established a uniform bound on the set of all $T_x$.


LEMMA 3.2. The set $\mathcal{C}F$ of all continuously translating $f \in F$ is a closed
subspace of $F$.


Proof. Silov (5, p. 5) uses a category argument. (But if a uniform bound
is assumed on the set of all translations $T_x$, then an elementary computation
with the norm can be used instead.)


LEMMA 3.3. There exists an equivalent norm on $\mathcal{C}F$ that makes all translations
isometric.


Proof. Silov (5, p. 5).


LEMMA 3.4. If $F$ and $G$ are translating spaces, with $F \subset G$, then
$\|f\|_F \ge \text{const}\, \|f\|_G$ for every $f \in F$.


Proof. The above norm inequality amounts to asserting that the injection
$F \to G$ is continuous. We use the closed graph theorem. Suppose that $f_n \to 0$
in the space $F$, and that $f_n \to g$ in the space $G$. Examine the Fourier coefficient

\[ \gamma_k(g) = \frac{1}{2\pi} \int_0^{2\pi} g(x)\, e^{-ikx}\, dx. \]

Because $G$ is a translating space, then by definition $\gamma_k(g) = \lim_n \gamma_k(f_n)$. But
because $F$ is a translating space, then $\lim_n \gamma_k(f_n) = \gamma_k(0) = 0$. Hence $\gamma_k(g) = 0$
for all $k$, and by the uniqueness theorem for Fourier series, $g = 0$. Thus
$F \to G$ is continuous, and the lemma is proved.


COROLLARY 3.5. The norm on a translating space $F$ must satisfy

\[ \|f\| \ge \text{const} \int_0^{2\pi} |f(x)|\, dx. \]

Proof. By having required all functions to be integrable, we have required
$F \subset L_1$.

The corollary below is conceptually of the greatest importance, showing as
it does that there is only one correct notion of convergence for the functions
in a given translating space.


COROLLARY 3.6. Let $F$ be a translation-invariant space of (periodic, integrable)
functions. Then, among all norms $\|f\|$ larger than the $L_1$ norm, there
exists at most one (up to equivalence) that will make $F$ complete.


Proof. Immediate from 3.4.


LEMMA 3.7. If an exponential monomial $e^{ikx}$ appears (with non-zero coefficient)
in the Fourier series of some $f \in F$, then $e^{ikx}$ itself belongs to $F$.


Proof. Silov (5, p. 12).


Let $P$ be the set of all trigonometric polynomials $\sum c_k e^{ikx}$ (only finitely
many $c_k \ne 0$).











THEOREM 3.8. The continuously translating subspace $\mathcal{C}F$ is precisely the
closure in $F$ of $P \cap F$.


Proof. Silov (5, pp. 9 and 13).


LEMMA 3.9. Let $F$ and $G$ be translating spaces. Then in order that $\mathcal{C}F \subset \mathcal{C}G$
it is necessary and sufficient that the following two conditions obtain:
(1) $F \cap P \subset G \cap P$,
(2) $\|p\|_F \ge \text{const}\, \|p\|_G$, for each $p \in P \cap F$.
Proof. Since $F \cap P = (\mathcal{C}F) \cap P$, and similarly for $G$, the necessity of
(1) is obvious. The necessity of (2) follows from Lemma 3.4 above.
Conversely, suppose (1) and (2), and let $f \in \mathcal{C}F$. Then there are trigonometric
polynomials $p_n \to f$ in the $F$ norm. Since the sequence $\{p_n\}$ is Cauchy
in $F$, then by (2) it is also Cauchy in $G$, hence $p_n \to g \in G$ in the $G$ norm. By
Theorem 3.8, actually $g \in \mathcal{C}G$. Finally, we recall that, by our definition of
translating spaces, also $p_n \to f$ in $L_1$ and $p_n \to g$ in $L_1$. Hence $f = g$, that is,
$f \in \mathcal{C}G$.

COROLLARY 3.10. If $F \cap P = G \cap P$, with $\|p\|_F \sim \|p\|_G$, then $\mathcal{C}F = \mathcal{C}G$,
and conversely.


4. Characterization of continuously translating Lipschitz functions
($\alpha = 1$). This section will be devoted to the proof of Theorem 4.11 and its
ancillary lemmas. Throughout, $F$ will be a translating space, with norm $\|f\|$.


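LEMMA 4.1. If $f \in \mathcal{C}F$ has mean

\[ \int_0^{2\pi} f(x)\, dx = 0 \]

(as it always does in this paper), then also the Banach-valued function $x \to T_x f$
has mean

\[ \int_0^{2\pi} T_x f \, dx = 0. \]

Proof. Write $\gamma_k$ for the Fourier coefficient functional defined by

\[ \gamma_k(h) = \frac{1}{2\pi} \int_0^{2\pi} h(y)\, e^{-iky}\, dy. \]

Let

\[ g = \int_0^{2\pi} T_x f \, dx. \]

It is enough to show that $\gamma_k(g) = 0$ for all $k$. But

\[ \gamma_k(g) = \gamma_k \Bigl( \int_0^{2\pi} T_x f \, dx \Bigr) = \int_0^{2\pi} \gamma_k(T_x f)\, dx = \frac{1}{2\pi} \int_0^{2\pi} \int_0^{2\pi} f(x + y)\, e^{-iky}\, dy\, dx \]
\[ = \frac{1}{2\pi} \int_0^{2\pi} \int_0^{2\pi} f(y)\, e^{-ik(y - x)}\, dy\, dx = 2\pi\, \gamma_{-k}(1)\, \gamma_k(f). \]

Now if $k \ne 0$, then $\gamma_{-k}(1) = 0$. And if $k = 0$, then $\gamma_k(f) = 0$.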


ve(g) 


THEOREM 4.2. If $f \in \mathcal{C}F$, then

\[ \|f\| \le \sup_{x > 0} \|T_x f - f\|. \]

Proof. We have

\[ \sup_{x > 0} \|T_x f - f\| \ge \frac{1}{2\pi} \int_0^{2\pi} \|T_x f - f\|\, dx \ge \frac{1}{2\pi} \Bigl\| \int_0^{2\pi} (T_x f - f)\, dx \Bigr\| = \|f\|, \]

the last equality by 4.1 above.


The idea used in the above proof can be made to yield a somewhat more
delicate result. Namely, assume only that $f \in L_1$, but that each $T_x f - f$
belongs to $F$ and that $\|T_x f - f\|$ approaches 0 with $x$. Then necessarily
$f \in F$. (Question: Is it enough to assume $T_x f - f$ belongs to $F$ for small $x$?)


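LEMMA 4.3. $\Lambda F = \Lambda \mathcal{C}F$.


Proof. $\Lambda F = \{f \in F : \|f\|^{\Lambda} < \infty\}$. And $\Lambda \mathcal{C}F = \{f \in \mathcal{C}F : \|f\|^{\Lambda} < \infty\}$.
We need only prove that $\Lambda F \subset \mathcal{C}F$. But if $f \in \Lambda F$, then $\|T_x f - f\| = O(x)$,
hence $\|T_x f - f\| \to 0$ as $x \to 0$.


LEMMA 4.4. $\|f\|^{\Lambda} \ge \text{const}\, \|f\|$ for every $f \in \Lambda F$.


Proof. Immediate from 4.2, since by 4.3 we need only consider $f \in \mathcal{C}F$.


THEOREM 4.5. The space $\Lambda F$ is complete for the norm $\|f\|^{\Lambda}$.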











Proof. Write

\[ \Delta_x f = \frac{T_x f - f}{x}, \]

so that $\|f\|^{\Lambda} = \sup_x \|\Delta_x f\|$. Let $\{f_n\}$ be a Cauchy sequence in the normed
space $\Lambda F$. By 4.4, $\{f_n\}$ is also Cauchy in the space $F$, hence $f_n$ converges in $F$
to some $f \in F$.

We claim that actually $f \in \Lambda F$. For since $\{f_n\}$ is Cauchy in $\Lambda F$, then
$\{f_n\}$ is bounded in $\Lambda F$, hence there is some $M < \infty$ such that $\|\Delta_x f_n\| \le M$
for all $x$ and for all $n$. Fixing $x$ and letting $n$ go to $\infty$ we see also that $\|\Delta_x f\| \le M$
for all $x$. That is to say, $\|f\|^{\Lambda} \le M$, and in particular, $f \in \Lambda F$.

We claim finally that $\{f_n\}$ converges to $f$ in the space $\Lambda F$. By considering
instead the sequence $\{f_n - f\}$ and changing notation, we can suppose that
$f = 0$. Then given $\epsilon > 0$ we must find $N$ such that $\|f_n\|^{\Lambda} \le \epsilon$ for all $n > N$.
Or equivalently, we want $\|\Delta_x f_n\| \le \epsilon$ for all $n > N$ and all $x$. Because $\{f_n\}$
is Cauchy in $\Lambda F$, we can find $N$ such that $\|\Delta_x(f_n - f_m)\| \le \epsilon$ for all $n, m > N$
and all $x$. Moreover, because each $\Delta_x$ is continuous $F \to F$, we can fix $x$ and
$n$ and let $m$ go to $\infty$, obtaining $\|\Delta_x f_n\| \le \epsilon$ for all $n > N$ and all $x$. Hence
the completeness of $\Lambda F$ is established.

Let us remark in passing that any family of operators $\Delta_x$ on an arbitrary
Banach space $F$ (each $\Delta_x$ bounded, but no uniform bound assumed on all $\Delta_x$)
can be used to define a new norm $\sup_x \|\Delta_x f\|$. And exactly as in Theorem 4.5
above, we see that the associated subspace $G$ is complete for the new norm if
$\sup_x \|\Delta_x f\|_F \ge \text{const}\, \|f\|_F$. Furthermore this last, seemingly contrived, condition
is actually necessary for completeness of the new space $G$ (assuming
that when $f \ne 0$ then at least one $\Delta_x f \ne 0$). To prove this necessity, we
appeal to the closed graph theorem. Suppose $f_n \to 0$ in the new (complete) $G$,
and $f_n \to f \in F$ in the old space $F$. We want to show that $f = 0$. Because
$f_n \to 0$ in $G$, then for each $\epsilon > 0$ we can find $N$ such that for $n > N$ and for
all $x$, $\|\Delta_x f_n\|_F < \epsilon$. And because $f_n \to f$ in $F$, then $\|\Delta_x f\|_F = \lim_n \|\Delta_x f_n\|_F \le \epsilon$.
But $\epsilon$ was arbitrary, and hence $\|\Delta_x f\|_F = 0$ for all $x$, that is, $f = 0$. Thus we
see that the injection $G \to F$ is continuous and $\|f\|_G \ge \text{const}\, \|f\|_F$.








LEMMA 4.6. The trigonometric polynomial p belongs to F if and only if its
derivative p’ belongs to F. 


Proof. By 3.7 it is enough to look at single exponentials $e^{inx}$, and for these
the Lemma is clearly true. 


LEMMA 4.7. If $p \in P \cap F$, then
$$\frac{\|T_x p - p\|}{x}$$
converges to $\|p'\|$.


Proof. Since all vector space topologies on a finite-dimensional space are
equivalent, then
$$\frac{T_x p - p}{x}$$
converges to $p'$ in the F norm restricted to the finite-dimensional space $P_0$
generated by the exponential monomials $e^{inx}$ that actually appear in the
expression $p(x) = \sum c_n e^{inx}$. And then
$$\frac{\|T_x p - p\|}{x}$$
converges to $\|p'\|$ because the norm $\|f\|$ in a normed space varies continuously
with f.


LemMA 4.8. Jf © ts a differentiable mapping from the interval (0, x] into 
a Banach space, then 
[@(~) — &(0)|| 


= < sup || ’(£)||. 


O<t<z 


Proof. It is enough to prove that for an arbitrary continuous functional Ψ
$$\frac{|\Psi(\Phi(x) - \Phi(0))|}{x} \le \sup_{0<\xi<x} |\Psi(\Phi'(\xi))|.$$
But $f(\xi) = \Psi(\Phi(\xi))$ is a numerical differentiable function, and the above
inequality is then a consequence of the elementary mean value theorem.
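The inequality of Lemma 4.8 can be checked numerically on a simple case. The following is a minimal sketch, assuming the hypothetical map Φ(t) = (cos t, sin t) into the plane with the Euclidean norm; the names `phi`, `phi_prime`, `difference_quotient` and `sup_derivative` are illustrative and not part of the paper, and the supremum is only approximated on a finite grid.

```python
import math

def phi(t):
    # Hypothetical example: a differentiable map from [0, x] into R^2.
    return (math.cos(t), math.sin(t))

def phi_prime(t):
    # Derivative of phi, computed by hand.
    return (-math.sin(t), math.cos(t))

def norm(v):
    # Euclidean norm on R^2 (plays the role of the Banach space norm).
    return math.hypot(v[0], v[1])

def difference_quotient(x):
    # ||phi(x) - phi(0)|| / x, the left side of the lemma.
    a, b = phi(x), phi(0.0)
    return norm((a[0] - b[0], a[1] - b[1])) / x

def sup_derivative(x, samples=1000):
    # Grid approximation of sup over 0 < xi < x of ||phi'(xi)||.
    return max(norm(phi_prime(x * (i + 0.5) / samples)) for i in range(samples))

for x in (0.1, 1.0, 3.0):
    # Small tolerance absorbs floating-point rounding.
    assert difference_quotient(x) <= sup_derivative(x) + 1e-12
```

For this choice of Φ the left side equals 2 sin(x/2)/x and the right side equals 1, so the inequality is strict for every x > 0.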

In order to state the following lemma succinctly, for any translating space F
let us define $\int F$ to be the set of all indefinite integrals $\int f$ of all the $f \in F$.
Since $f \to \int f$ is a one-one linear mapping, commuting with all translations $T_x$,
we can make $\int F$ a translating Banach space by transporting to it the F norm,
so that


LEMMA 4.9.
$$P \cap \Big(\int \mathcal{C}F\Big) = P \cap (\mathcal{C}\Lambda F) = P \cap F.$$


Proof. $P \cap (\int \mathcal{C}F) = P \cap (\mathcal{C}F)$ by Lemma 4.6, and $P \cap (\mathcal{C}F) = P \cap F$
by Theorem 3.8. On the other hand, $P \cap (\mathcal{C}\Lambda F) = P \cap (\Lambda F)$ again
by Theorem 3.8. Clearly $P \cap (\Lambda F) \subset P \cap F$. The opposite inclusion will
follow from Lemma 4.6 together with the second norm inequality below.


LEMMA 4.10. For every $p \in P \cap F$,
$$\|p'\| \le \sup_{x>0} \frac{\|T_x p - p\|}{x} \le \text{const}\,\|p'\|.$$
Proof. The first inequality is an immediate consequence of Lemma 4.7.


The second inequality uses Lemma 4.8, as follows. Set $\Phi(x) = T_x p$, so that
$\Phi(0) = p$ and $\Phi'(t) = T_t p'$. Then
$$\frac{\|T_x p - p\|}{x} = \frac{\|\Phi(x) - \Phi(0)\|}{x} \le \sup \|\Phi'(t)\| = \sup \|T_t p'\| \le \text{const}\,\|p'\|,$$
the last inequality by 3.3. This completes our last lemma of §4.

We can now prove the characterization of continuously translating Lipschitz
spaces that generalizes the Hardy-Littlewood characterization of
$\Lambda L_p$. We again use the notation $\int F$ defined in the paragraph preceding 4.9.


THEOREM 4.11. For every translating space F, with norm $\|f\|$,
$$\mathcal{C}\Lambda F = \int \mathcal{C}F.$$
In more detail:
(1) f is absolutely continuous with derivative $f' \in F$, and $\|T_x f' - f'\|$ varies
continuously with x if and only if (2) $\|T_x f - f\|^{\Lambda}$ varies continuously with x.

Proof. Recall that $\|f\|^{\Lambda}$ is defined by
$$\sup_x \frac{1}{x}\|T_x f - f\|.$$
From 4.9 and 3.8 we know that $P \cap F$ is a dense subspace of both
$\mathcal{C}\Lambda F$ and $\int \mathcal{C}F$. Furthermore, by 4.10, these two spaces induce equivalent
norms on $P \cap F$. Hence by 3.10, $\mathcal{C}\Lambda F = \int \mathcal{C}F$.


5. Characterization of continuously translating Hölder functions
(α < 1). Given a translating space F, with norm $\|f\|$, and a positive number
α < 1, we can define the Hölder seminorm
$$\|f\|^{\alpha} = \sup_{x>0} \frac{\|T_x f - f\|}{x^{\alpha}}$$
and the associated Hölder space $\Lambda^{\alpha}F$ exactly as we defined $\|f\|^{\Lambda}$ and
$\Lambda F$ (= $\Lambda^{1}F$) in §2.

Example. For F = C, with $\|f\| = \|f\|_{\infty}$, the space $\Lambda^{\alpha}C$ consists of the
functions satisfying an ordinary uniform Hölder condition of order α. To see
that not every $f \in \Lambda^{\alpha}C$ translates continuously for the Hölder norm, take
some $f = x^{\alpha}$ for $0 \le x \le \pi$, and otherwise continuously differentiable (with
mean 0 as usual). Then $f \in \Lambda^{\alpha}C$. But for all positive $x < \pi/2$, we have
$$\|T_x f - f\|^{\alpha} = \sup_y\, y^{-\alpha}\,\|(T_{x+y} - T_x - T_y + I)f\|_{\infty}$$
$$\ge \sup_y\, y^{-\alpha}\,|f(x+y) - f(x) - f(y)|$$
$$\ge x^{-\alpha}\,|f(2x) - f(x) - f(x)|$$
$$= x^{-\alpha}\,|(2x)^{\alpha} - 2x^{\alpha}| = 2 - 2^{\alpha}.$$
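The last step of the example can be checked by direct arithmetic. The sketch below assumes only the stretch $f(x) = x^{\alpha}$ on $[0, \pi]$ used in the estimate, with the illustrative choice α = 1/2; the helper names `f` and `lower_bound` are hypothetical. It evaluates $x^{-\alpha}|f(2x) - 2f(x)|$ for several small x and compares it with $2 - 2^{\alpha}$, which does not tend to 0 as x tends to 0.

```python
import math

def f(x, alpha):
    # f(x) = x**alpha on [0, pi]; only this stretch enters the estimate.
    return x ** alpha

def lower_bound(x, alpha):
    # x^{-alpha} * |f(2x) - f(x) - f(x)|, valid for 0 < x < pi/2.
    return abs(f(2 * x, alpha) - 2 * f(x, alpha)) / x ** alpha

alpha = 0.5  # illustrative exponent, any 0 < alpha < 1 behaves the same way
values = [lower_bound(x, alpha) for x in (1.0, 0.1, 1e-3, 1e-6)]
target = 2 - 2 ** alpha

# The lower bound is independent of x, so translation cannot be continuous
# at x = 0 for the Holder norm.
assert all(abs(v - target) < 1e-9 for v in values)
```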


By analogy with Theorem 4.11 it is natural to conjecture that f translates
continuously in $\Lambda^{\alpha}F$ if and only if f has a continuous "fractional" derivative
$f^{(\alpha)} \in \mathcal{C}F$. Yet we shall see that, in contrast with the case α = 1, this condition
is sufficient but not necessary.

To describe the correct condition for continuous translation in $\Lambda^{\alpha}F$ we
must look at still another associated space $\lambda^{\alpha}F$, defined by the requirement
that
$$\lim_{x\to 0} \frac{\|T_x f - f\|}{x^{\alpha}} = 0.$$
($\|\ \|$ means the original F norm, as usual.) In the language of O and o,
$\Lambda^{\alpha}F = \{f \in F : \|T_x f - f\| = O(x^{\alpha})\}$ and $\lambda^{\alpha}F = \{f \in F : \|T_x f - f\| = o(x^{\alpha})\}$. We
shall prove in Theorem 5.8 below that for α < 1 the continuously translating
subspace of $\Lambda^{\alpha}F$ is in fact $\lambda^{\alpha}F$ (while for α = 1 the space $\lambda^{1}F$ contains only
the function ≡ 0).


LEMMA 5.1. 


LEMMA 5.2.


THEOREM 5.3. The space $\Lambda^{\alpha}F$ is complete for the norm $\|f\|^{\alpha}$.

The proof of the above theorem and its lemmas is exactly like the corresponding
proof for $\Lambda F$ in §4.

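LEMMA 5.4. If $\|T_x f - f\| = o(x^{\alpha})$ as $x \to 0$, then $\|T_x f - f\|^{\alpha} = o(1)$.

Proof. We can assume without loss of generality that all $T_x$ are isometries.
By definition,
$$\|T_x f - f\|^{\alpha} = \sup_{y>0} \|T_y(T_x f - f) - (T_x f - f)\|\,y^{-\alpha}$$
$$= \sup_{y>0} \|T_x(T_y f - f) - (T_y f - f)\|\,y^{-\alpha}.$$
For $0 < y \le x$, we have
$$\|T_x(T_y f - f) - (T_y f - f)\|\,y^{-\alpha} \le \|T_x(T_y f - f)\|\,y^{-\alpha} + \|T_y f - f\|\,y^{-\alpha} = 2\|T_y f - f\|\,y^{-\alpha}.$$
And for $y \ge x > 0$ we have
$$\|T_y(T_x f - f) - (T_x f - f)\|\,y^{-\alpha} \le \|T_y(T_x f - f)\|\,x^{-\alpha} + \|T_x f - f\|\,x^{-\alpha} = 2\|T_x f - f\|\,x^{-\alpha}.$$
Hence
$$\|T_x f - f\|^{\alpha} \le 2 \sup_{0<y\le x} \|T_y f - f\|\,y^{-\alpha},$$
from which inequality the lemma follows immediately.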
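The final inequality of the proof can be tested on a discrete model. The sketch below is an illustration under stated assumptions, not the setting of the paper: the circle is replaced by a uniform grid of N points, translations are cyclic shifts by grid multiples (so they are isometries for the grid sup norm), and the suprema over y run over grid shifts only. The names `N`, `H`, `ALPHA`, `holder_norm` and `right_side` are hypothetical.

```python
import math

N = 64                      # grid size (assumed); shifts are multiples of 2*pi/N
H = 2 * math.pi / N
ALPHA = 0.5                 # illustrative Holder exponent

def sup_norm(g):
    return max(abs(v) for v in g)

def shift(g, j):
    # (T_y g)(t_k) = g(t_k + y) with y = j*H: a cyclic shift on the grid.
    return [g[(k + j) % N] for k in range(N)]

def diff(g, j):
    # T_y g - g for y = j*H.
    s = shift(g, j)
    return [s[k] - g[k] for k in range(N)]

# Sample function on the grid (a trigonometric polynomial, chosen arbitrarily).
f = [math.sin(k * H) + 0.3 * math.cos(3 * k * H) for k in range(N)]

def holder_norm(g):
    # Discrete analogue of sup_{y>0} ||T_y g - g|| * y^(-alpha).
    return max(sup_norm(diff(g, j)) / (j * H) ** ALPHA for j in range(1, N))

def right_side(i):
    # 2 * sup over 0 < y <= x of ||T_y f - f|| * y^(-alpha), with x = i*H.
    return 2 * max(sup_norm(diff(f, j)) / (j * H) ** ALPHA for j in range(1, i + 1))

# Pairs (left side, right side) of the inequality for every grid shift x = i*H.
checks = [(holder_norm(diff(f, i)), right_side(i)) for i in range(1, N)]
assert all(left <= right + 1e-12 for left, right in checks)
```

The two cases of the proof (y at most x, and y at least x) carry over verbatim to grid shifts, which is why the discrete inequality holds up to rounding.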













LEMMA 5.5. $\lambda^{\alpha}F$ is a closed subspace of $\Lambda^{\alpha}F$.


Proof. Let $f_n \to f$ in $\Lambda^{\alpha}F$, with $f_n \in \lambda^{\alpha}F$. Choose $f_n$ such that $\|f - f_n\|^{\alpha} < \epsilon$.
Writing 


we have $\|A_x(f - f_n)\| < \epsilon$ for all x. We want $\|A_x f\| \to 0$. And we have
$\|A_x f\| \le \|A_x(f - f_n)\| + \|A_x f_n\|$. Hence
$$\limsup_{x\to 0} \|A_x f\| \le \epsilon + \lim_{x\to 0} \|A_x f_n\| = \epsilon.$$
Since ε was arbitrary, then
$$\lim_{x\to 0} \|A_x f\| = 0,$$
and $f \in \lambda^{\alpha}F$, proving the lemma.

LEMMA 5.6. $\|T_x p - p\| = o(x^{\alpha})$ for every trigonometric polynomial $p \in F$.

Proof. By 4.7,
$$\frac{\|T_x p - p\|}{x}$$
converges to $\|p'\|$. Hence
$$\frac{\|T_x p - p\|}{x^{\alpha}} = \frac{\|T_x p - p\|}{x}\,x^{1-\alpha} \to 0 \quad \text{as } x \to 0.$$

LEMMA 5.7. If $\|T_x f - f\|^{\alpha} = o(1)$ as $x \to 0$, then $\|T_x f - f\| = o(x^{\alpha})$.


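Proof. The lemma asserts that $\mathcal{C}\Lambda^{\alpha}F \subset \lambda^{\alpha}F$. If $f \in \mathcal{C}\Lambda^{\alpha}F$, from 3.8 it
follows that f is the limit in $\Lambda^{\alpha}F$ of trigonometric polynomials $p_n$. By 5.6 each
$p_n \in \lambda^{\alpha}F$, and by 5.5 $\lim p_n = f \in \lambda^{\alpha}F$.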

The conjunction of Lemmas 5.4 and 5.7 is 

THEOREM 5.8. For every translating space F, $\mathcal{C}\Lambda^{\alpha}F = \lambda^{\alpha}F$.

In other words, the following functions must both be continuous or both
be discontinuous for $x \ge 0$:
$$x \to \|T_x f - f\|^{\alpha},$$
$$x \to \|T_x f - f\|\,x^{-\alpha},$$
the norm $\|f\|^{\alpha}$ being defined as
$$\sup_{y>0} \|T_y f - f\|\,y^{-\alpha}.$$
Finally, to justify the remark about fractional derivatives in the third
paragraph of this section, it will be convenient to use Zygmund's smooth
functions (6), which by definition (are continuous and) satisfy
$$\|T_x f + T_{-x} f - 2f\|_{\infty} = o(x).$$













As usual, $\|f\|_{\infty} = \sup_y |f(y)|$. Let us write $\lambda^{*}C$ for the space of smooth functions.
(One can of course define $\lambda^{*}F$ for any translating space F, and also $\Lambda^{*}F$
with O replacing o.) Let us write $\int^{\alpha} f$ for the Weyl fractional indefinite integral
of f, defined to have Fourier series $\sum (in)^{-\alpha} c_n e^{inx}$ when f has Fourier series
$\sum c_n e^{inx}$. Zygmund (6, Theorems 11 and 12, p. 53) proves that $\int^{\alpha} f \in \lambda^{\alpha}C$
if and only if $\int f \in \lambda^{*}C$. It follows that if f is continuous then $\int^{\alpha} f \in \lambda^{\alpha}C$,
since every continuously differentiable function belongs to $\lambda^{*}C$. On the other
hand, to demonstrate the existence of discontinuous f with $\int^{\alpha} f \in \lambda^{\alpha}C$, it is
enough to exhibit an $f \in \lambda^{*}C$ that is not continuously differentiable. But in
fact Zygmund shows there are smooth f that fail to have a derivative at almost
all points.


REFERENCES 


1. S. Bochner, Additive set functions on groups, Ann. Math., 40 (1939), 769-799. 

2. K. de Leeuw, Linear spaces with a compact group of operators, Ill. J. Math., 2 (1958), 367-377.

3. G. H. Hardy and J. E. Littlewood, Some properties of fractional integrals I, Math. Zeit., 
27 (1927-1928), 565-606. 

4. A. Plessner, Eine Kennzeichnung der totalstetigen Funktionen, J. reine angew. Math., 160
(1929), 26-32. 

5. G. E. Silov, Homogeneous rings of functions, Uspehi Matem. Nauk N.S., 6 (1951), A.M.S. 
Translation 92. 

6. A. Zygmund, Smooth Functions, Duke Math. J., 12 (1945), 47-76. 


Dartmouth College 











ON LIE SEMI-GROUPS 
R. P. LANGLANDS 


1. Suppose we have a semi-group structure defined on
$$\Pi = \{(p^1, \ldots, p^n) \mid p^1 \ge 0, \ldots, p^n \ge 0\},$$
a subset of real Euclidean n-space, $E_n$, by $(p,q) \to F(p,q) = p \circ q$. In this
note we shall be concerned with a representation $T(\cdot)$ of Π as a semi-group
of bounded linear operators on a Banach space $\mathfrak{X}$. More particularly, we
suppose that postulates P,, P2, P;, Ps, and Ps of chapter 25 of (2) are satisfied 
so that, by Theorem 25.3.1 of that book, there is a continuous function, $f(\cdot)$,
defined on Π such that $f((\rho + \sigma)a) = f(\rho a) \circ f(\sigma a)$ for $a \in \Pi$, $\rho, \sigma \ge 0$;
that the representation is strongly continuous in a neighbourhood of the
origin and that $T(0) = I$. Then for $a \in \Pi$, $\rho \to T(f(\rho a))$ is a strongly continuous
one parameter semi-group; denote its infinitesimal generator by
$A(a)$. We shall study, under the assumption that $F(p, q)$ is three times continuously
differentiable, the relations among the $A(a)$ and their adjoints
$A^*(a)$. Following a suggestion of Hille (1) we first prove some "Dense Graph
Theorems." Using these we show that the expected linear and commutation
relations hold. We also show that $\bigcap_{a \in \Pi} D(A(a))$ [$D(A(a))$ is the domain
of $A(a)$] is invariant under $T(p)$ for small p in Π. The proofs are so formulated
that, with minor changes, they remain valid in other situations.

We should like to thank C. T. Ionescu Tulcea who suggested that the 
methods of (3) might be applicable to the questions discussed here. 


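2. We set
$$\frac{\partial F^k}{\partial p^j} = F^k_{j;}(p,q); \quad \frac{\partial F^k}{\partial q^j} = F^k_{;j}(p,q); \quad \frac{\partial^2 F^k}{\partial q^i\,\partial p^j} = F^k_{j;i}(p,q);$$
$$F^k_{j;i}(0,0) - F^k_{i;j}(0,0) = \gamma^k_{ij}.$$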
$F(p,q)$ may be extended to a twice continuously differentiable function
defined on $E_n \times E_n$. Denote some fixed extension by $F(p, q)$. Since
$$F^j_{k;}(0,0) = F^j_{;k}(0,0) = \delta^j_k,$$
there are open spheres $N_1$, $N_2 \subset N_1$, about the origin and three times continuously
differentiable functions $\psi(q, h)$ and $\chi(q, h)$ defined on $N_1 \times N_1$ such
that $\psi(0,0) = \chi(0,0) = 0$, $F(h, \psi(q, h)) = q$, and $F(\chi(q,h), h) = q$. Moreover
if $F(h, p) = q$ [$F(p, h) = q$] with $p, h \in N_2$, then $q \in N_1$ and $\psi(q,h) = p$
[$\chi(q, h) = p$]. We may also suppose that all derivatives of $\psi(q,h)$ and
$\chi(q, h)$ up to the third order are bounded in $N_1$, that $T(\cdot)$ is strongly continuous
in $N_1 \cap \Pi$, and that $\det(F^j_{k;}(0, p)) \ge 1/2$ and $\det(F^j_{;k}(p, 0)) \ge 1/2$
for p in $N_1$. If $N \subset N_1$ is an open sphere about the origin, set
$$E(N) = \Big\{ y \;\Big|\; y = \int_{N \cap \Pi} K(q)T(q)x\,dq,\ x \in \mathfrak{X},\ K(q) \in C^2(N \cap \Pi) \Big\}.$$
$C^2(N \cap \Pi)$ is the set of twice continuously differentiable functions which are 0
outside of $N \cap \Pi$. $E(N)$ is dense in $\mathfrak{X}$.

Received June 30, 1959.


PROPOSITION 1. Let $N_3$ be an open sphere about the origin with $F(N_3, N_3) \subset N_2$.
If $y \in E(N_3)$, then $T(p)y$ is a twice continuously differentiable function
of p in $N_3 \cap \Pi$.


Proof. We understand that some derivatives at the boundary will be
one-sided. If $y \in E(N_3)$ and $e_j = (\delta_j^1, \ldots, \delta_j^n)$ we have, recalling that
$K(q)$ is 0 outside of $N_3 \cap \Pi$,
$$\lim_{s \to 0} s^{-1}(T(p + se_j)y - T(p)y)$$


= lim s~ Mis K@T(O + ses) 0g) — T(p oq))xdq 


. | r=—p+ se; 
= lim s* (x (W(q,7r)) aee(2¥, — (q,7 )) T (q)xdq 
s+0 Noh 


| pep 


= f — p)) “= x (g, i im + lim j G(q, p, s)dq 
Nfs ” s50 J NoNs 


Nole re 


since G(q, p, s) converges ait to 0 with s. The final integral is a con- 
tinuous function of p. In a similar manner we show that it is once continuously 
differentiable. 
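We remark the following formulae, valid for $y \in E(N_3)$, $p \in N_3 \cap \Pi$:
(i) $$\lim_{s\to 0} s^{-1}(T(f(sa)) - I)T(p)y = \lim_{s\to 0} s^{-1}(T(f(sa) \circ p)y - T(p)y)$$
$$= \lim_{s\to 0}\Big[ s^{-1}\sum_{j=1}^{n} (F^j(f(sa), p) - p^j)\frac{\partial}{\partial p^j}T(p)y + s^{-1}o(|f(sa) \circ p - p|)\Big]$$
(1) $$= \sum_{j=1}^{n}\Big(\sum_{k=1}^{n} F^j_{k;}(0, p)a^k\Big)\frac{\partial}{\partial p^j}T(p)y.$$
So $T(p)y \in D(A(a))$ and $A(a)T(p)y$ is given by (1).

(ii) $$T(p)A(a)y = \lim_{s\to 0} s^{-1}(T(p \circ f(sa))y - T(p)y)$$
(2) $$= \sum_{j=1}^{n}\Big(\sum_{k=1}^{n} F^j_{;k}(p, 0)a^k\Big)\frac{\partial}{\partial p^j}T(p)y.$$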
j=l k=l op 


(iii) Setting (y./(p)) = (F%x(p, 0)), 













(3) pT (by -> yi(p)T (p)A (ex)y. 


(iv) Setting ap) >: F1.(0,p)v\(p), 
j=l 


(4) A(a)A(b)T(p)y = ott atio)o')( Fn;(0, p)a *) 3 T(p)A (ex)y. 
(v) (α) $A(a+b)y = A(a)y + A(b)y$
(β) $A(e_i)A(e_j)y - A(e_j)A(e_i)y = \sum_{k=1}^{n} \gamma^k_{ij}A(e_k)y.$


For a proof of the latter relation, see (2, p. 758). 


3. We come now to the "Dense Graph Theorems."


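THEOREM 1. Let $\{a_1, \ldots, a_p\} \subset \Pi$; if $G_0$ is the closure in the product topology
on $\mathfrak{X} \times \ldots \times \mathfrak{X}$ (p + 1 factors) of $\{(x, A(a_1)x, \ldots, A(a_p)x) \mid x \in E(N_3)\}$ and
$$G = \Big\{(x, A(a_1)x, \ldots, A(a_p)x) \;\Big|\; x \in \bigcap_{j=1}^{p} D(A(a_j))\Big\}$$
then $G_0 = G$.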


Proof. $G \supseteq G_0$ since an infinitesimal generator is a closed operator. We
show that $G_0 \supseteq G$. Let $\{b_{r+1}, \ldots, b_n\}$ be a maximal linearly independent
subset of $\{a_1, \ldots, a_p\}$; it is sufficient to prove the theorem for the former set.
Let $\{b_1, \ldots, b_n\} \subset \Pi$ be a basis for $E_n$. If $t = (t^1, \ldots, t^n) \in \Pi$, set $p(t) =
f(t^1 b_1) \circ \ldots \circ f(t^n b_n)$. $p(t)$ is a twice continuously differentiable map of Π into Π
and may be extended to a twice continuously differentiable map of $E_n$ into
$E_n$. Since


2, ; (0) = 


$p(t)$ has a twice continuously differentiable inverse defined in a sphere $N_4$ about
the origin. We may suppose that $F(N_4, N_4) \subset N_3$ and that all derivatives of
the inverse function up to the second order are bounded in $N_4$. If $y \in E(N_4)$
and $p \in N_4 \cap \Pi$ then $T(p)y \in E(N_3)$. For $y \in E(N_4)$, set
$$u(y,s) = \Big(\prod_{j=1}^{n} s^j\Big)^{-1}\int_{R(s)} S(t)y\,dt,$$
where $s = (s^1, \ldots, s^n)$, $S(t) = T(p(t))$, $R(s)$ is the rectangle with sides
$[0, s^j e_j]$, and $R(s)$ is contained in the image of $N_4$ under the inverse map.
Using (1) 


A (by)u(y, 5) 1 (b,;) S(t)ydt 


n —1 2 n 
j i a) ~ 
s (.(t) —z S(t)ydt, 
(Ti :) J. s) ial $a ot ( 


ll 
para, 
Tt 
eae” 
| 
—? 


ll 










where 
f(t) = > Fi:(0, p(t))or a 


is once continuously differentiable. Integrating by parts,


(4) A(b)u(y, s) = (11 si ib> | oi(t)S(t)y dt". 


i J 0) \4 
_f ¥ iosim|. 
R(s) i=1 


Since the integral of a function with values lying in a closed subspace of a
Banach space is contained in that subspace
(5) $$(u(y, s), A(b_{r+1})u(y, s), \ldots, A(b_n)u(y, s)) \in G_0.$$
Since (4) is a continuous function of y and $E(N_4)$ is dense in $\mathfrak{X}$; for any $y \in \mathfrak{X}$,
$u(y,s) \in \bigcap_{j=r+1}^{n} D(A(b_j))$ and (4) and (5) hold. To complete the proof it is
sufficient to show
(6) $$\lim_{\sigma \to 0} u(y, s(\sigma)) = y,$$
(7) $$\lim_{\sigma \to 0} A(b_k)u(y, s(\sigma)) = A(b_k)y$$
for $k \ge r + 1$, $y \in \bigcap_{j=r+1}^{n} D(A(b_j))$, and $s(\sigma) = (\sigma, \ldots, \sigma)$. (6) is clear;
to prove (7) we expand $\phi_j^i(t)$ in a Taylor's series and consider

. ~ 1 i . (t a ie 

| , t) S(t) | - Ut 

ga Seren ( y (t*, 0) ‘ 


= lim [oe f bio (S(1', a)y — S(t’, 0)y)at' +- 
R(t ) 


0 


+o f , Or (0) S(i', o)ydt! + 
R(s*) 


at’ 
+ = : (¥ o af 185 @)) (SG, oy — S(t‘, 0)y)dt' + o(t) | 
R(s*) jFi 
F) 
A(hy)y + 4 af (0)y 
provided 
(8) lim o(S(t*, yy — S(t*, O)y) = A(d,)y. 


But the left side equals 
k—1 
[] rseb,))| MT f(obk))y — y) + 
=] 


+ (T(f(ob,)) nS ng TSC bm yo" (TS (t'b)))y ») | 


lm 











and (8) follows if we recall that $t^i \le \sigma$ and that $y \in D(A(b_i))$ for $i \ge k \ge r + 1$.
Summing over i and taking the last term of (4) into account we obtain (7).


THEOREM 2. If $F_0$ is the closure in the product topology of
$$\{(y, A(e_1)y, \ldots, A(e_n)y, A(e_i)A(e_j)y) \mid y \in E(N_3)\}$$
and if
$$F = \Big\{(y, A(e_1)y, \ldots, A(e_n)y, A(e_i)A(e_j)y) \;\Big|\; y \in \bigcap_{k=1}^{n} D(A(e_k)) \cap D(A(e_i)A(e_j))\Big\}$$
then $F = F_0$.

Proof. F is a closed set and thus $F \supseteq F_0$. We show $F_0 \supseteq F$. Taking $b_k = e_k$,
we use the notation of the proof of Theorem 1. For $y \in E(N_4)$


A (e,)A (e;)u(y, s) A (e,;)A (e;) S(t) ydt 


ll II 
a, gare 
i i 
= ag 

> oy Py 


> 5m 5 =m (S(t)A (ex) y)dt 


R(s) k.m=1 
where 
n ot” 
k ww 
dn(t) = Do Bi(p(t)) Fi, p(t)) —> 
r=1 op 
is once continuously differentiable. Integrating by parts 


(9) A (e,)A (e;)uly, s) = 


= r | (¢", im 
- (Ti : y LE Son > bn(t)S()A (er) IY (gm ° ) a 


“ ‘= ie: - (t) S(t)A (e: oat | 


Theorem 1 implies that (9) holds for $y \in \bigcap_{k=1}^{n} D(A(e_k))$. The proof is then
completed as above.


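4. We now consider the adjoints of the infinitesimal generators. If $y^* \in \mathfrak{X}^*$
we denote the value of $y^*$ at $y \in \mathfrak{X}$ by $(y, y^*)$. If $N \subset N_1$, set
$$E^*(N) = \Big\{ y^* \in \mathfrak{X}^* \;\Big|\; (y, y^*) = \int_{N \cap \Pi} (y, K(q)T^*(q)x^*)\,dq,\ \text{with } x^* \in \mathfrak{X}^* \text{ and } K(q) \in C^2(N \cap \Pi)\Big\}.$$
$E^*(N)$ is dense in $\mathfrak{X}^*$ in the weak* topology.

PROPOSITION 2. If $y^* \in E^*(N_3)$, $T^*(p)y^*$ is twice continuously differentiable
in the weak* topology for p in $N_3 \cap \Pi$.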













Proof. We merely sketch the calculations since the proof is essentially the 
same as that of Proposition 1. 


lim ‘fo. K(q)(T*(p + se;) — T*(p))T*(q)x*) dq = 


lim | (y, K(q)(T*(¢ 0 (p + se;)) — T*(q 0 p))x*) dq 


s+0 


: a ; - 
Sin. ap’ (xix0. P))  der( 2% i (q, »)))7*(@a*) da 


The last integral is again a continuously differentiable function of p.
We remark the following, valid for $y^* \in E^*(N_3)$ and $p \in N_3 \cap \Pi$:


(i) lim s7* (y, (T*(f(sa)) — I)T*(p)y*) = 


s0 


(10) -> (> Fn (P, ve) = (y, T"(p)y"). 


m=! 
This implies that $T^*(p)y^* \in D(A^*(a))$ and that $(y, A^*(a)T^*(p)y^*)$ is given
by the right side of (10).
(ii) As in the remarks following Proposition 1 we may show
(α') $A^*(a + b)y^* = A^*(a)y^* + A^*(b)y^*$
(β') $A^*(e_i)A^*(e_j)y^* - A^*(e_j)A^*(e_i)y^* = -\sum_{k=1}^{n} \gamma^k_{ij}A^*(e_k)y^*.$

THEOREM 3. Let $\{a_1, \ldots, a_p\} \subset \Pi$; if $H_0$ is the closure in the product of
the weak* topologies of $\{(y^*, A^*(a_1)y^*, \ldots, A^*(a_p)y^*) \mid y^* \in E^*(N_3)\}$ and
$H = \{(y^*, A^*(a_1)y^*, \ldots, A^*(a_p)y^*) \mid y^* \in \bigcap_{j=1}^{p} D(A^*(a_j))\}$ then $H = H_0$.

Proof. $H \supseteq H_0$ since $A^*(a)$ is closed in the weak* topology. We show
$H_0 \supseteq H$. Let $\{b_1, \ldots, b_r\}$ be a maximal linearly independent subset of
$\{a_1, \ldots, a_p\}$; it is sufficient to prove the theorem for the former set. Let
$\{b_1, \ldots, b_n\} \subset \Pi$ be a basis for $E_n$. Again we use the notation of the proof
of Theorem 1. If $y^* \in E^*(N_4)$ set
$$(y, u(y^*, s)) = \Big(\prod_{j=1}^{n} s^j\Big)^{-1}\int_{R(s)} (y, S^*(t)y^*)\,dt$$
with $S^*(t) = T^*(p(t))$. As above
(11) (y, A*(d,)u(y*, s)) = 


” 1 n . i. A 
(Ti :) F- j EL(e)(y, S*(y*)| 4S? at 
j=l J R(>t) ‘i (f ,0) 


- | ye ok i (t)<y, S*(t)y*)dt 
© R(s) 


i=l 


with 


ti(t) = > F*; j(P(t), 0) ot 


j.m=1 














As above, $u(y^*, s) \in \bigcap_{k=1}^{r} D(A^*(b_k))$ for all $y^* \in \mathfrak{X}^*$ and $A^*(b_k)u(y^*, s)$ is
given by (11). Moreover,
$$(u(y^*, s), A^*(b_1)u(y^*, s), \ldots, A^*(b_r)u(y^*, s)) \in H_0.$$
The proof may be completed as before if we show that
(12) lim o"{y, (S*(t*, «) — S*(¢*, 0))y*) = (y, A* (be) 9") 


for $1 \le k \le r$, $t^j \le \sigma$, and $y^* \in \bigcap_{j=1}^{r} D(A^*(b_j))$. But the expression on the
left equals


CTL Tye)», o '(T*(f(ob,)) ou ny*> 


k-1 k—1 
+> CT T(f(t"bm))(T(f(obx)) — D) 
i=—1 =i+l1 
. . —k- i ‘ 
I] TU@b,))y, o (T*(F'b) — Dy): 


and (12) follows since, see (3), $\sigma^{-1}(T^*(f(t^j b_j)) - I)y^*$ is uniformly bounded
and $\sigma^{-1}(T^*(f(\sigma b_k)) - I)y^*$ converges in the weak* topology to $A^*(b_k)y^*$.


If $a = (a^1, \ldots, a^n) \in E_n$, $A(a)y = \sum_{j=1}^{n} a^j A(e_j)y$ is defined for
$y \in E(N_3)$. By the remarks after Proposition 2, $E^*(N_3)$ is contained in the
domain of its adjoint so that $A(a)$ has a least closed extension which we again
denote by $A(a)$. This notation is consistent with that used previously.

LEMMA. $A^*(a)$, the adjoint of $A(a)$, is the weak* closure of the operator
$\sum_{j=1}^{n} a^j A^*(e_j)$ with domain $E^*(N_3)$.

Proof. Suppose $(y, x_1^*) = (A(a)y, x_2^*)$ for all $y \in E(N_3)$. Then, using
Theorem 1 and the notation of its proof with $b_k = e_k$, for $y \in \mathfrak{X}$


n nm . ae ‘ a 
of (S(t)y, xi)dt =o" > | > : (55()S(@)y, x2)| ee °) ait 
Ri s(e)) R(s*) ; 


a7 J (= a wists, stat 
R(s) i=1 at 


Transposing and taking limits 


limo" >> a’ . (y, (S*(t, ¢) — S*(t’, 0))x2)dt? = (y, x1). 
j=1 @ Ris?) 


a0 


Then, using (11), 
n 


. * * 
lim (y, , a’ A*(e;)u(x2, s(o))) = (y, x1). 


740 j=1 


The lemma is now an easy consequence of Theorem 3. 







By Theorem 25.8.1 of (2) the $\gamma^k_{ij}$ may be used to define a Lie algebra
over $E_n$. We denote the Lie product of a and b by [a, b]. The following theorem
can easily be proved using the lemma, formulae (α), (β), (α'), and (β'), and
the Hahn-Banach theorem.


THEOREM 4. I. The function $a \to A(a)$ defined on $E_n$ has the properties

(i) If $x \in D(A(a)) \cap D(A(b))$ then $x \in D(A(sa + tb))$ and $A(sa + tb)x =
sA(a)x + tA(b)x$.

(ii) If $x \in D(A(a)A(b)) \cap D(A(b)A(a))$ then $x \in D(A([a,b]))$ and
$A([a, b])x = A(a)A(b)x - A(b)A(a)x$.

II. The function $a \to A^*(a)$ has the properties

(i) If $x^* \in D(A^*(a)) \cap D(A^*(b))$ then $x^* \in D(A^*(sa + tb))$ and
$A^*(sa + tb)x^* = sA^*(a)x^* + tA^*(b)x^*$.

(ii) If $x^* \in D(A^*(a)A^*(b)) \cap D(A^*(b)A^*(a))$ then $x^* \in D(A^*([a, b]))$
and $A^*([a, b])x^* = A^*(b)A^*(a)x^* - A^*(a)A^*(b)x^*$.


Recalling that if a sequence of once continuously differentiable functions 
and the sequences of first order derivatives converge uniformly on some domain 
then the limit function is once continuously differentiable and its partial 
derivatives are the limits of the sequences of partial derivatives, we have, 
using (3) and Theorem 1, 


THEOREM 5. If $y \in \bigcap_{k=1}^{n} D(A(e_k))$ then $T(p)y$ is once continuously
differentiable in some neighbourhood, in Π, of the origin and (3) holds.
Consequently, $T(p)y \in D(A(a))$ for $a \in E_n$ and p in this neighbourhood.

The following theorem is an immediate consequence of Theorem 2.

THEOREM 6. If $y \in \bigcap_{k=1}^{n} D(A(e_k)) \cap D(A(e_i)A(e_j))$ then $y \in D(A(e_j)A(e_i))$.


REFERENCES 


1. E. Hille, Lie theory of semi-groups of linear transformations, Bull. Amer. Math. Soc., 56 (1950).

2. E. Hille and R. S. Phillips, Functional analysis and semi-groups, Amer. Math. Soc. Coll.
Publ., 31 (1957).

3. K. de Leeuw, On the adjoint semi-group and some problems in the theory of approximation,
Math. Zeit., 73 (1960).


Yale University 











THE EXPRESSION OF TRIGONOMETRICAL SERIES IN 
FOURIER FORM 


GEORGE CROSS 


1. Introduction. In a paper published in 1936 Burkill (2) proved that, 
if the trigonometrical series 


(1.1) $$\sum_{n=1}^{\infty} c_n \exp(int), \quad c_n = a_n - ib_n,$$


is bounded except on a countable set and if the series obtained by integrating 
series (1.1) once converges everywhere, then the coefficients can be written 
in Fourier form using the C,P-integral. In §3 of this paper an analogous result 
is shown to be true when (1.1) is bounded (C, k), k ≥ 0. The proof of this
depends on generalizations of theorems by Verblunsky and Zygmund and 
both of these generalizations are obtained in §2. 

For convenience, the definitions of the de la Vallée Poussin derivative
(cf. 6, p. 59) and the $C_nP$-integral (1) are given here.


DEFINITION 1.1. Let g(t) be a function defined in the closed interval [a, b].
If, for a given $t_0$ in [a, b],
$$g(t_0 + h) = c_0 + c_1 h + c_2 h^2/2! + \ldots + c_k h^k/k! + o(h^k),$$
as $h \to 0$, where the numbers $c_j = c_j(t_0)$ are independent of h, then $c_k$ is called
the kth de la Vallée Poussin derivative of g at the point $t_0$ and is denoted by
$g_{(k)}(t_0)$. If $\phi(t) = f(t) + ig(t)$, then $\phi_{(k)}(t) = f_{(k)}(t) + ig_{(k)}(t)$ wherever $f_{(k)}(t)$
and $g_{(k)}(t)$ are defined.
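Definition 1.1 can be illustrated numerically: once $c_0, \ldots, c_{k-1}$ are known, the quotient $k!\,(g(t_0+h) - \sum_{j<k} c_j h^j/j!)/h^k$ tends to $c_k$ as h tends to 0. The sketch below is only an illustration, assuming the example g(t) = exp(t) at $t_0 = 0$ (where every $c_j$ equals 1); the helper name `peano_quotient` is hypothetical.

```python
import math

def peano_quotient(g, coeffs, t0, h):
    # coeffs = [c_0, ..., c_{k-1}] are assumed known; the returned quotient
    # approximates c_k, the k-th de la Vallee Poussin derivative at t0.
    k = len(coeffs)
    taylor = sum(c * h ** j / math.factorial(j) for j, c in enumerate(coeffs))
    return (g(t0 + h) - taylor) * math.factorial(k) / h ** k

# Example: g = exp at t0 = 0 with c_0 = c_1 = 1, so the quotient should approach c_2 = 1.
estimates = [peano_quotient(math.exp, [1.0, 1.0], 0.0, h) for h in (1e-1, 1e-2, 1e-3)]
errors = [abs(e - 1.0) for e in estimates]
```

Very small h should be avoided in floating point, since the numerator then suffers from cancellation.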


The $C_nP$-integral is defined by induction. Suppose that for $n \ge 1$ the
$C_{n-1}P$-integral has been defined taking as the $C_0P$-integral the Perron
integral (4, p. 201). Assuming that u(t) is $C_{n-1}P$-integrable, let
$$C_n(u, t, t + h) = (n/h^n)\,C_{n-1}P\int_t^{t+h} (t + h - \xi)^{n-1}u(\xi)\,d\xi.$$


DEFINITION 1.2. The function u(t) is said to be $C_n$-continuous at $t_0$ if
$C_n(u, t_0, t_0 + h) \to u(t_0)$ as $h \to 0$.

DEFINITION 1.3. The upper and lower $C_n$-derivates of u(t) denoted by
$C_nD^*u(t)$ and $C_nD_*u(t)$, respectively, are defined to be the lim sup and the lim
inf, respectively, as $h \to 0$ of the expression
$$\frac{n+1}{h}\big(C_n(u, t, t+h) - u(t)\big).$$
Received June 9, 1959. 


DEFINITION 1.4. If $C_nD^*u(t) = C_nD_*u(t)$, their common value is defined to
be the $C_n$-derivative of u(t) and is denoted by $C_nDu(t)$.

DEFINITION 1.5. The function M(t) is said to be a $C_n$-major function of u(t)
over [a, b] if

(1.2.1) M(t) is $C_n$-continuous;
(1.2.2) M(a) = 0;
(1.2.3) $C_nD_*M(t) \ge u(t)$, p.p. in [a, b];
(1.2.4) $C_nD_*M(t) > -\infty$ in [a, b].

A $C_n$-minor function m(t) is defined in a similar way.

DEFINITION 1.6. If, for every ε > 0, there is a pair M(t), m(t) satisfying the
conditions of Definition 1.5 and such that $|M(b) - m(b)| < \epsilon$, then u(t) is said
to be $C_nP$-integrable over [a, b].

DEFINITION 1.7. Let I(b) = lower bound of all M(b) and J(b) = upper
bound of all m(b). For a $C_nP$-integrable function u(t) the bounds have a common
limit (1) which is called the definite $C_nP$-integral of u(t) over [a, b].

If $\phi(t) = f(t) + ig(t)$ then the definition of the $C_n$-derivative is extended to
$\phi(t)$ in the usual way, and
$$C_nP\int \phi(t)\,dt = C_nP\int f(t)\,dt + i\,C_nP\int g(t)\,dt,$$


whenever the integrals on the right-hand side are defined. 


2. The integrated series. 
THEOREM 2.1. Let the series
(2.1) $$\sum_{n=1}^{\infty} c_n \exp(int)$$
be bounded (C, k) for a fixed k = 0, 1, 2, ..., and $t \in E$, |E| > 0. If r = k + 2,
then for each $t \in E$,
(2.2.j) $$\sum_{n=1}^{\infty} [c_n \exp(int)/(in)^j] = H^{(r-j)}(t), \quad (C, r - j - 1)$$
for j = 1, 2, ..., r. Further the series (2.2.r) converges absolutely and uniformly
to $H^{(0)}(t)$ [= H(t), say] for all $t \in E$ and $H_{(s)}(t)$ exists and is finite, $0 \le s \le r - 1$,
$t \in E$ and
(2.3) $$H_{(s)}(t) = H^{(s)}(t).$$
Furthermore for all $t \in E$
(2.4) $$H(t+h) = H(t) + hH_{(1)}(t) + \ldots + \frac{h^{r-1}}{(r-1)!}H_{(r-1)}(t) + \omega(t,h)\frac{h^r}{r!},$$
where $\omega(t, h) = O(1)$ as $h \to 0$, and in particular $H_{(r)}(t)$ exists p.p. in E.













Proof. It is clear that $c_n \exp(int) = O(n^k)$ and this is sufficient to guarantee
the convergence property of series (2.2.r). The summability (C, k) of series
(2.2.1) and the summability of series (2.2.2), (2.2.3), ..., (2.2.r − 1) follows
from two theorems by Hardy (3, Theorem 71, p. 128) and (3, Theorem 76,
p. 131).

To obtain (2.3) and (2.4) it may be assumed without loss of generality that
t = 0. Let
exp (iu) 

(iu)” 


P(h) = 5 Gh) (kh) = sp 


y(u) = 


and for any sequence $\{u_n\}$ let $\Delta u_n = \Delta^1 u_n = u_n - u_{n+1}$, $\Delta^j u_n = \Delta(\Delta^{j-1}u_n)$.
Then Zygmund's proof (6, p. 66), with condition $s_n^k = o(n^k)$ replaced by
$s_n^k = O(n^k)$ yields


r—1 
(2.5) Hh) = > (: .) + h'R(h), 


where A, = > s,"A**'(in)*-’ and R(h) = > s,*A**'A(nh) both converge 
absolutely, and R(k) = O(1) as h 0. Thus 


$$H(t+h) = H(t) + hH_{(1)}(t) + \ldots + \frac{h^{r-1}}{(r-1)!}H_{(r-1)}(t) + \omega(t,h)\frac{h^r}{r!},$$
where $\omega(t, h) = O(1)$ as $h \to 0$. It follows from a theorem due to Marcinkiewicz
and Zygmund (6, p. 76) that $H_{(r)}(t)$ exists p.p. in E.
Equation (2.5) gives H,,»)(O) = A,_,; = & s,*A**'(in)~’, and since the 
(C,k) sum of the series > c,/(m)’ equals the (C,O) sum of the series 
> s,*A**'(n-4), (3, p. 128), (2.3) is established. 


THEOREM 2.2. If under the hypothesis of Theorem 2.1 the set E is an open
interval and
$$C_0DH = \frac{dH}{dt}$$
then
(2.6) $C_sDH_{(s)}(t) = H_{(s+1)}(t)$, $0 \le s \le k$, $t \in E$,
(2.7) $C_{k+1}DH_{(k+1)}(t) = H_{(k+2)}(t)$ p.p. in E.
Further, if $H_{(s)}(t) = F_{(s)}(t) + iG_{(s)}(t)$, $0 \le s \le r$, then for all $t \in E$,
(2.8) $|C_{k+1}D^*F_{(k+1)}(t)| < \infty$, $|C_{k+1}D_*F_{(k+1)}(t)| < \infty$, $|C_{k+1}D^*G_{(k+1)}(t)| < \infty$,
$|C_{k+1}D_*G_{(k+1)}(t)| < \infty$.


The following lemma is required for the proof. 







LEMMA. If
$$\sum_{n=1}^{\infty} a_n$$
is summable (C, r + 1), where r > −1, then a necessary and sufficient condition
that it should be bounded (C, r) is that $B_n^r = O(n^{r+1})$ where $b_n = na_n$ and $B_n^0$,
$B_n^1$, $B_n^2$, ..., are formed from the $b_n$ as $A_n^0$, $A_n^1$, $A_n^2$, ... are from the $a_n$ (cf. 3,
p. 96).


The relation (2.6) will be proved by induction. The result is trivial for
s = 0 and in view of (2.3) reduces to a lemma of Verblunsky (5, p. 206). By
the lemma stated above, series (2.2.1) is bounded (C, k − 1), series (2.2.2)
is bounded (C, k − 2), ..., series (2.2.r − 1) is bounded. Assume that the
relation holds for all s < k and hence that $H_{(s)}(t)$ is a $C_sP$-integral of $H_{(s+1)}(t)$
for all s < k. Then (k − 1) integrations by parts (1) gives





C.DH p(t) = 
(k/h")C,1P fro +h — u)*"Hy(u)du — Ha (t) 
0 Fk + 1) 
= lim [et | axe +h) — H(t) - > (“Veto |, 


and, by Theorem 2.1, this limit equals $H_{(k+1)}(t)$.
It can be shown similarly that 


Crs rDH e+1 (0) - 
9\I k+1 n 
lim [242 | aa +h)- HW) - D (“Vata | 
aa n= } 


if the limit on the right-hand side exists. By Theorem 2.1, therefore,
$C_{k+1}DH_{(k+1)}(t)$ exists p.p. in E and is equal to $H_{(r)}(t)$.
Finally, it follows from (2.4) that


k+1 n 
go) AEP EK +h) -HO-¥ (!: ites | = 0(1) 


n=1 n! 


as h — 0. This establishes (2.8). 


3. The expression of coefficients in terms of the $C_{k+1}P$-integral.
This section contains the main result of the paper.

THEOREM 3.1. Under the hypothesis and with the notation of Theorem 2.1
and Theorem 2.2, if E = [−π, π], then
$$c_n = \frac{1}{2\pi}\,C_{k+1}P\int_{-\pi}^{\pi} H_{(r)}(t)\exp(-int)\,dt.$$
Proof. To fix ideas take k = 2. In virtue of (2.6) it is clear that
$$H_{(s)}(t) - H_{(s)}(-\pi) = C_sP\int_{-\pi}^{t} H_{(s+1)}(x)\,dx, \quad 0 \le s \le 2.$$
Furthermore, since $H_{(4)}(t) = F_{(4)}(t) + iG_{(4)}(t)$, it follows from (2.7) and (2.8)
and the $C_3$-continuity of $F_{(3)}(t)$ and $G_{(3)}(t)$ that
$$H_{(3)}(t) - H_{(3)}(-\pi) = C_3P\int_{-\pi}^{t} H_{(4)}(x)\,dx.$$
Hence, using the property of integration by parts for the $C_nP$-integral,


cP { H(t) exp(—int)dt = cP f Hs) (t) (—in)exp(—int)dt 
= cp f H(t) (—in)’ exp(—int)dt 
= CoP | H(t) (—in)* exp(—int)dt 


= cop { H(t) (in)* exp(—int)dt = 2xc,, 


since series (2.2.4) converges absolutely and uniformly to H(t). This proves 
the theorem. 


COROLLARY 1. If series (2.1) is summable (C, k) for all t to a function
$\psi(t) = u(t) + iv(t)$, $|\psi(t)| < \infty$, then the coefficients can be written in the form
$$c_n = \frac{1}{2\pi}\,C_{k+1}P\int_{-\pi}^{\pi} \psi(t)\exp(-int)\,dt.$$

Proof. It is well known (6, p. 69) that if H(t) is defined as in Theorem 2.1,
then $H_{(r)}(t) = \psi(t)$, and the proof follows from Theorem 3.1.

COROLLARY 2. (The real analogue of Theorem 3.1.) If $\sum A_n(x)$ and
$\sum B_n(x)$ are bounded (C, k) in [−π, π] then


ee , YY irae 
an - J F.,(t) cos(nt)dt = - | G.,)(t) sin (nt)dt, 
Tr _— T . 


by 


1 wT ‘ : = j Tr . 
— | F(t) sin(nt)dt = — j G.)(t) cos(nt)dt, 
T fr T Jus 


where $F_{(r)}(t)$, $G_{(r)}(t)$ are as defined in Theorem 2.2 and the integrals are
$C_{k+1}P$-integrals.


REFERENCES 


1. J. C. Burkill, The Cesàro-Perron scale of integration, Proc. London Math. Soc. (2), 39 (1935),
541-552.

2. ———— The expression of trigonometrical series in Fourier form, J. London Math. Soc., 11
(1936), 43-48.

3. G. H. Hardy, Divergent series (Oxford, 1949).

4. S. Saks, Theory of the integral (Warsaw, 1937).

5. S. Verblunsky, On the theory of trigonometric series VII, Fund. Math., 23 (1934), 193-236.

6. A. Zygmund, Trigonometric series (2nd ed.; Cambridge, 1958), II.


University of Western Ontario










ON A DISCRIMINANT INEQUALITY 
L. J. MORDELL 


The following result has been conjectured by Dr. Birch. Let $z_1, z_2, \ldots, z_n$
be any n complex numbers such that
(1) $$|z_1|^2 + |z_2|^2 + \ldots + |z_n|^2 = n.$$
Then
(2) $$A = \prod_{r,s=1,\ r \ne s}^{n} |z_r - z_s|$$
attains its greatest value when the z are at the vertices of a regular n-sided polygon
inscribed in the circle |z| = 1.


It seems to be difficult to prove this but Dr. Birch informs me that some
work by Mulholland¹ shows that the result is false for large n. I can, however,
prove that the result is true for n = 3, and then A ≤ 27. The suggested
general result would be $A \le n^n$.

I show first that the maximum value of A arises from values of z satisfying
either the equation
(3) $$\sum_{s=1,\ s \ne r}^{n} \frac{1}{z_r - z_s} = \tfrac{1}{2}(n - 1)\bar z_r, \quad (r = 1, 2, \ldots, n),$$
where $\bar z_r$ denotes the conjugate of $z_r$; or the equations typified by $z_n = 0$ and
(4) $$\frac{1}{z_r} + \sum_{s=1,\ s \ne r}^{n-1} \frac{1}{z_r - z_s} = \tfrac{1}{2}(n - 1)\bar z_r, \quad (r = 1, 2, \ldots, n - 1).$$
The conjectured result is then proved for n = 3. It is also proved that the
result is true if we impose the condition that the z's lie on the circle |z| = 1.

In the original version of this paper, this result was used to prove the result 
for n = 3. This led to a very interesting maximum problem in two variables, 
namely, 

Problem. To find the maximum value when $x^2 + y^2 \leq 1$, $x \geq 0$, $y \geq 0$, of

(5)  $f(x, y) = (x^2 + y^2 - kxy)(1 - x^2)(1 - y^2).$

Though the deduction of the conjectured result for $n = 3$ is not short, it
seems worth-while reproducing the original proof since the ideas involved
may be of further use. I think the method may give the greatest value of
$\Delta$ for $n = 4$, but this I leave to others.

The general problem was brought to my notice by Dr. J. H. H. Chalk,
who after reading the original version of my paper, informed me that the
conjecture was false for $n \geq 6$. His counter example is given by


Received April 13, 1959. 
¹ "Inequalities between the geometric mean difference and the polar moments of a plane
distribution," Journal of the London Mathematical Society, 33 (1958), 260-269.















$z_n = 0$ and $z_r = \left(\frac{n}{n-1}\right)^{1/2} \exp\left(\frac{2\pi i r}{n-1}\right), \quad (r = 1, 2, \ldots, n - 1).$
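The counter example can be checked numerically. The following is a minimal sketch, assuming the configuration is the origin together with a regular (n - 1)-gon of radius $(n/(n-1))^{1/2}$; the helper names `delta`, `regular_polygon` and `chalk_configuration` are illustrative and do not come from the paper.

```python
import cmath
import math
from itertools import combinations


def delta(zs):
    """Product of |z_r - z_s|^2 over all pairs r < s, as in (2)."""
    prod = 1.0
    for a, b in combinations(zs, 2):
        prod *= abs(a - b) ** 2
    return prod


def regular_polygon(n):
    """Vertices of the regular n-gon inscribed in |z| = 1."""
    return [cmath.exp(2j * math.pi * r / n) for r in range(n)]


def chalk_configuration(n):
    """z_n = 0 plus a regular (n-1)-gon scaled so that sum |z|^2 = n."""
    rho = math.sqrt(n / (n - 1))
    return [0j] + [rho * z for z in regular_polygon(n - 1)]


for n in range(3, 9):
    zs = chalk_configuration(n)
    # Constraint (1): the squared moduli must add up to n.
    assert abs(sum(abs(z) ** 2 for z in zs) - n) < 1e-9
    print(n, delta(zs), n ** n, delta(zs) > n ** n)
```

In this sketch the comparison with $n^n$ should first come out in favour of the counter example at n = 6, in agreement with the statement above.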


The equation (3) (but not (4)) was communicated to me by Dr. Birch
after I had written the original version of this paper. He did not give the
factor $\tfrac{1}{2}(n - 1)$ in (3).

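For the general case, write

$z_1 = r_1 e^{i\theta_1}, \ldots, z_n = r_n e^{i\theta_n}, \quad r_1 \geq 0, \ldots, r_n \geq 0.$

Then

(6)  $\Delta = \prod (r_1^2 + r_2^2 - 2 r_1 r_2 \cos(\theta_1 - \theta_2)),$

where

$r_1^2 + r_2^2 + \cdots + r_n^2 = n.$
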
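Suppose first that no r is zero. Then we can apply Lagrange's method of
undetermined multipliers. Hence we have two sets of equations, one typified
by the two

(7)  $\sum_{s \neq 1} \frac{r_1 - r_s \cos(\theta_1 - \theta_s)}{r_1^2 + r_s^2 - 2 r_1 r_s \cos(\theta_1 - \theta_s)} - \lambda r_1 = 0,$

(8)  $\sum_{s \neq 2} \frac{r_2 - r_s \cos(\theta_2 - \theta_s)}{r_2^2 + r_s^2 - 2 r_2 r_s \cos(\theta_2 - \theta_s)} - \lambda r_2 = 0,$

where $\lambda$ is an undetermined multiplier, and the other typified by

(9)  $\sum_{s \neq 1} \frac{r_1 r_s \sin(\theta_1 - \theta_s)}{r_1^2 + r_s^2 - 2 r_1 r_s \cos(\theta_1 - \theta_s)} = 0.$

Multiply the equations (7), (8), ..., by $r_1, r_2, \ldots$, and add. Then $\lambda = \tfrac{1}{2}(n - 1)$.
Multiply (9) by $-i/r_1$ and add to (7). Then

$\sum_{s \neq 1} \frac{r_1 - r_s e^{i(\theta_1 - \theta_s)}}{r_1^2 + r_s^2 - 2 r_1 r_s \cos(\theta_1 - \theta_s)} = \tfrac{1}{2}(n - 1)\, r_1.$

Hence

(10)  $\sum_{s=2}^{n} \frac{1}{z_1 - z_s} = \tfrac{1}{2}(n - 1)\,\bar{z}_1.$

Adding the equations typified by (10), we find

(11)  $\sum_{s=1}^{n} z_s = 0.$
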
Suppose next that some r are zero. Clearly at most one can be zero, say
$r_n = 0$. Then we have

$\Delta = r_1^2 r_2^2 \cdots r_{n-1}^2 \prod (r_1^2 + r_2^2 - 2 r_1 r_2 \cos(\theta_1 - \theta_2)),$

where the product is extended over $r_1, r_2, \ldots, r_{n-1}$. Lagrange's method now
gives

(12)  $\frac{1}{r_1} + \sum_{s=2}^{n-1} \frac{r_1 - r_s \cos(\theta_1 - \theta_s)}{r_1^2 + r_s^2 - 2 r_1 r_s \cos(\theta_1 - \theta_s)} - \mu r_1 = 0,$

(13)  $\sum_{s=2}^{n-1} \frac{r_s \sin(\theta_1 - \theta_s)}{r_1^2 + r_s^2 - 2 r_1 r_s \cos(\theta_1 - \theta_s)} = 0.$







On multiplying the equations typified by (12) by $r_1, r_2, \ldots$, and adding, we
have

$n - 1 + \tfrac{1}{2}(n - 1)(n - 2) - \mu n = 0,$

or

$\mu = \tfrac{1}{2}(n - 1).$

Hence proceeding as before, we have the equation (4).
We now prove the conjecture for $n = 3$. The equation (3) gives

(14)  $\frac{1}{z_1 - z_2} + \frac{1}{z_1 - z_3} = \bar{z}_1.$

From (11), $z_1 + z_2 + z_3 = 0$, and so (14) gives

$3 z_1 = \bar{z}_1 (z_1 - z_2)(z_1 - z_3).$

Hence

$3 \bar{z}_1 = z_1 (\bar{z}_1 - \bar{z}_2)(\bar{z}_1 - \bar{z}_3),$

and so on multiplying these together,

$|z_1 - z_2|\,|z_1 - z_3| = 3.$

Clearly

$|z_1 - z_2| = |z_2 - z_3| = |z_3 - z_1| = 3^{1/2},$

and so $z_1, z_2, z_3$ are at the vertices of an equilateral triangle whose side is of
length $3^{1/2}$. Also the incentre of the triangle is at the origin. Clearly $\Delta = 27$.
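The equilateral case can be verified numerically. This is a small sketch, assuming the triangle is placed at the cube roots of unity; `delta` and `residuals_eq3` are hypothetical helper names introduced here for illustration.

```python
import cmath
import math
from itertools import combinations


def delta(zs):
    """Product of |z_r - z_s|^2 over all pairs r < s."""
    prod = 1.0
    for a, b in combinations(zs, 2):
        prod *= abs(a - b) ** 2
    return prod


def residuals_eq3(zs):
    """For each r, left side minus right side of equation (3)."""
    n = len(zs)
    out = []
    for r, zr in enumerate(zs):
        total = sum(1 / (zr - zt) for t, zt in enumerate(zs) if t != r)
        out.append(total - 0.5 * (n - 1) * zr.conjugate())
    return out


triangle = [cmath.exp(2j * math.pi * r / 3) for r in range(3)]
sides = [abs(a - b) for a, b in combinations(triangle, 2)]

print("largest residual of (3):", max(abs(e) for e in residuals_eq3(triangle)))
print("side lengths:", sides)
print("Delta:", delta(triangle))
```

Up to rounding, the residuals should vanish, each side should have length $3^{1/2}$, and the product should equal 27.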
Suppose next that $z_3 = 0$. The equation (4) gives

$\frac{1}{z_1} + \frac{1}{z_1 - z_2} = \bar{z}_1, \quad \frac{1}{z_2} + \frac{1}{z_2 - z_1} = \bar{z}_2.$

Add these and take also the conjugate equation. Then on multiplying these
together, we find either

$z_1 + z_2 = 0$, or $|z_1 z_2| = 1$.

Since

$|z_1|^2 + |z_2|^2 = 3,$

we find either

$|z_1|^2 = |z_2|^2 = \tfrac{3}{2}$, or $|z_1 - z_2| = 1$.

Then

$\Delta = |z_1|^2 |z_2|^2 |z_1 - z_2|^2 = \tfrac{27}{2}$ or $1$.

Hence these values of the z do not give the greatest value of $\Delta$.

We now prove the general conjecture when we impose the condition that
the z lie on the circle $|z| = 1$. The equation (7) is true independently of
Lagrange's method when $\lambda = \tfrac{1}{2}(n - 1)$ since now the r's are equal to 1. The
equation (9) with the r's equal to 1 still arises by Lagrange's method applied
to the $\theta$. Hence (10) is still true and now $\bar{z}_1 = 1/z_1$; and so the z satisfy
equations typified by













(15)  $\sum_{s=2}^{n} \frac{1}{z_1 - z_s} = \tfrac{1}{2}(n - 1)\,\frac{1}{z_1}.$

Let $z = z_1, z_2, \ldots, z_n$ be the roots of the polynomial equation $f(z) = 0$. Then
(15) gives

$\frac{f''(z)}{f'(z)} = \frac{n - 1}{z},$

and so

(16)  $z f''(z) - (n - 1) f'(z) = 0$

for $z = z_1, z_2, \ldots, z_n$. Since (16) is of degree $n - 1$ in z, (16) holds identically
in z. Hence

$\log(f'(z)) = (n - 1) \log z + \log c_1,$

$f'(z) = c_1 z^{n-1},$

$f(z) = \frac{c_1 z^n}{n} + c_2,$

where $c_1, c_2$ are arbitrary constants. Hence the result.
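The passage from (15) to the ratio $f''/f'$ rests on a standard identity for a polynomial with simple roots. A brief derivation follows; the auxiliary polynomial $g$ and the constant $c$ are notation introduced here for illustration only.

```latex
\begin{align*}
f(z) &= (z - z_r)\, g(z), \qquad g(z) = c \prod_{s \neq r} (z - z_s), \\
f'(z_r) &= g(z_r), \qquad f''(z_r) = 2\, g'(z_r), \\
\frac{f''(z_r)}{2 f'(z_r)} &= \frac{g'(z_r)}{g(z_r)} = \sum_{s \neq r} \frac{1}{z_r - z_s}.
\end{align*}
```

Substituting this into (15) gives $f''(z_r)/f'(z_r) = (n - 1)/z_r$ at each root $z_r$.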

Since $|z| = 1$, the equation $f(z) = 0$ must be equivalent to $z^n - e^{in\alpha} = 0$
where $\alpha$ is real. This shows that $z_1, z_2, \ldots, z_n$ are at the vertices of a regular
n-sided polygon inscribed in the circle $|z| = 1$. To find $\Delta$, there is no loss of
generality in taking $\alpha = 0$. Then the vertices are at $z_r = e^{2r\pi i/n}$, $(r = 0,
1, \ldots, n - 1)$. Then for these z,

(17)  $\Delta = \prod_{r<s} (2 - 2\cos(\theta_r - \theta_s)) = 2^{n(n-1)} \prod_{r<s} \sin^2\left(\frac{\theta_r - \theta_s}{2}\right) = n^n.$

This is deduced from

$\cos(n\theta) - 1 = 2^{n-1} \prod_{r=0}^{n-1} \left(\cos\theta - \cos\frac{2r\pi}{n}\right)$

on dividing by $\cos\theta - 1$ and putting $\theta = 0$. Then obviously

$\prod_{r<s} \sin^2\left(\frac{\theta_r - \theta_s}{2}\right) = \left(\frac{n}{2^{n-1}}\right)^n.$
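The value in (17) is easy to confirm numerically for small n. The sketch below assumes the angles $\theta_r = 2r\pi/n$ of the regular polygon; the function names are illustrative.

```python
import math
from itertools import combinations


def sine_product(thetas):
    """Product of sin^2((theta_r - theta_s)/2) over all pairs r < s."""
    prod = 1.0
    for a, b in combinations(thetas, 2):
        prod *= math.sin(0.5 * (a - b)) ** 2
    return prod


def polygon_angles(n):
    """Arguments of the vertices of the regular n-gon, theta_r = 2*r*pi/n."""
    return [2 * math.pi * r / n for r in range(n)]


for n in range(2, 9):
    p = sine_product(polygon_angles(n))
    # Columns: n, Delta computed from (17), n^n, and the closed form of the sine product.
    print(n, 2 ** (n * (n - 1)) * p, n ** n, (n / 2 ** (n - 1)) ** n)
```

For each n the second and third columns should agree up to rounding, and the sine product itself should match the last column.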


I now give the original proof of the general conjecture when $n = 3$ found
by using the result above. Write

$\Delta = \prod (r_1^2 + r_2^2 - 2 r_1 r_2 \cos(\theta_1 - \theta_2)),$

where

(18)  $r_1^2 + r_2^2 + r_3^2 = 3.$

The greatest value of $\Delta$ cannot arise when $r_3 = 0$, since then $r_1^2 + r_2^2 = 3$
and

$\Delta \leq r_1^2 r_2^2 (r_1 + r_2)^2 \leq 27/2.$







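Suppose first that all the cosines are $\leq 0$. If $\cos\theta \leq 0$, then

(19)  $x^2 + y^2 - 2xy\cos\theta \leq 2\sin^2(\tfrac{1}{2}\theta)\,(x^2 + y^2)$

since

$(x - y)^2 \cos\theta \leq 0.$

Hence

$\Delta \leq 8 \prod (r_1^2 + r_2^2) \prod \sin^2 \tfrac{1}{2}(\theta_1 - \theta_2) \leq 8 \cdot 8 \cdot 27/64 = 27$

from (18) and (17), equality arising only when $r_1 = r_2 = r_3 = 1$; and this
is the case of the equilateral triangle.

Suppose secondly that all the cosines are $\geq 0$. Then

$\Delta \leq \prod (r_1^2 + r_2^2) \leq 8.$

Suppose thirdly that only two of the cosines are $\geq 0$, say $\cos(\theta_1 - \theta_2) \leq 0$.
Then

$\Delta \leq (r_1^2 + r_2^2 - 2 r_1 r_2 \cos(\theta_1 - \theta_2))(r_1^2 + r_3^2)(r_2^2 + r_3^2)$
$\leq (r_1 + r_2)^2 (r_1^2 + r_3^2)(r_2^2 + r_3^2) \leq 16.$

Suppose finally that only one of the cosines is $\geq 0$, say $\cos(\theta_1 - \theta_2) \geq 0$. Then
from (19)

$\Delta \leq 4\left((r_1 - r_2)^2 + 4 r_1 r_2 \sin^2\left(\frac{\theta_1 - \theta_2}{2}\right)\right) (r_1^2 + r_3^2)(r_2^2 + r_3^2) \sin^2\left(\frac{\theta_1 - \theta_3}{2}\right) \sin^2\left(\frac{\theta_2 - \theta_3}{2}\right)$
$\leq 4 (r_1 - r_2)^2 (r_1^2 + r_3^2)(r_2^2 + r_3^2) + 16 r_1 r_2 (r_1^2 + r_3^2)(r_2^2 + r_3^2) \cdot 27/64$

on noting (17). Hence since $r_1^2 + r_3^2 = 3 - r_2^2$, etc.,

$4\Delta \leq (3 - r_1^2)(3 - r_2^2)(16 r_1^2 - 5 r_1 r_2 + 16 r_2^2).$

We require the maximum value of the right-hand side where $r_1^2 + r_2^2 \leq 3$.
On putting $r_1^2 = 3x^2$, $r_2^2 = 3y^2$, we are led to the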


Problem. To find the maximum value M of

(20)  $f = f(x, y) = (x^2 + y^2 - kxy)(1 - x^2)(1 - y^2)$

when $x^2 + y^2 \leq 1$, $x \geq 0$, $y \geq 0$.

Clearly $M \geq \tfrac{1}{4}$ on taking $x^2 = \tfrac{1}{2}$, $y = 0$. We prove in particular that when
$k = 5/16$, then $M = \tfrac{1}{4}$ arising from $x^2 = y^2 = \tfrac{1}{3}$ or from $xy = 0$, $x^2 + y^2 = \tfrac{1}{2}$.
These give $4\Delta \leq 108$. We note that $xy = 0$ does not lead to a maximum value
of $\Delta$.
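The value of k and the bound $4\Delta \leq 108$ can be read off by carrying out the substitution explicitly. The following worked step assumes the bound $4\Delta \leq (3 - r_1^2)(3 - r_2^2)(16 r_1^2 - 5 r_1 r_2 + 16 r_2^2)$ obtained above, together with $r_1^2 = 3x^2$, $r_2^2 = 3y^2$ and hence $r_1 r_2 = 3xy$.

```latex
\begin{align*}
(3 - r_1^2)(3 - r_2^2)\,(16 r_1^2 - 5 r_1 r_2 + 16 r_2^2)
  &= 9\,(1 - x^2)(1 - y^2) \cdot 48\,\bigl(x^2 + y^2 - \tfrac{5}{16}\,xy\bigr) \\
  &= 432\, f(x, y) \quad \text{with } k = \tfrac{5}{16}, \\
4\Delta &\leq 432\, M = 432 \cdot \tfrac{1}{4} = 108 .
\end{align*}
```

So $M = \tfrac{1}{4}$ for $k = 5/16$ is exactly what is needed for $\Delta \leq 27$.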
Write

$x = \sqrt{r}\cos\theta, \quad y = \sqrt{r}\sin\theta, \quad s = \sin\theta\cos\theta,$

and so

$0 \leq r \leq 1, \quad 0 \leq s \leq \tfrac{1}{2}.$

Then

(21)  $f(x, y) = g(r, s) = r(1 - ks)(1 - r + r^2 s^2).$
Several cases must be considered, and so we denote by $M_1, M_2, \ldots$, possible
values among which M must be found. We first investigate possible maximum
values of $g(r, s)$ arising from the boundary values of r, s.

We begin with the boundary values of s. 

First, $s = 0$. Then $g = r(1 - r)$ and so $M_1 = \tfrac{1}{4}$ when $r = \tfrac{1}{2}$. Then $x = 1/\sqrt{2}$,
$y = 0$, or $x = 0$, $y = 1/\sqrt{2}$.

Secondly, $s = \tfrac{1}{2}$. Then

$g = r\left(1 - \tfrac{1}{2}k\right)\left(1 - \tfrac{1}{2}r\right)^2.$

We need only consider $k < 2$. Then the maximum $M_2$ arises when $r = 2/3$
giving $M_2 = 4(2 - k)/27$. We can reject this unless $4(2 - k)/27 \geq \tfrac{1}{4}$ or
$k \leq 5/16$. Hence if $k \leq 5/16$, $M_2 = 4(2 - k)/27$ arising from $x = y = 1/\sqrt{3}$.
When $k = 5/16$, $M_2 = M_1 = \tfrac{1}{4}$.

We need only take the boundary value $r = 1$. Then $g = (1 - ks)s^2$ and the
extremal value arises from $s = 2/(3k)$. This satisfies $0 < s \leq \tfrac{1}{2}$ only if $k \geq 4/3$.
Then $g = 4/(27k^2) < \tfrac{1}{4}$. Hence if $k < 4/3$, a possible maximum may arise from
$s = \tfrac{1}{2}$ and then $M_3 = (2 - k)/8 \geq \tfrac{1}{4}$ only if $k \leq 0$. Clearly $M_3 < M_2$.

To summarize, the boundary values give possible maxima $M_1 = \tfrac{1}{4}$ for
$k \geq 5/16$, and $M_2 = 4(2 - k)/27$ when $k \leq 5/16$.

We now consider non-boundary values of r, s. We put

$\frac{\partial g}{\partial r} = \frac{\partial g}{\partial s} = 0.$

Hence

$1 - 2r + 3r^2 s^2 = 0, \quad -k + kr - 3kr^2 s^2 + 2r^2 s = 0.$

Multiply the first equation by k and add to the second. Then $-kr + 2r^2 s = 0$,
and so $2rs = k$ since we need not consider $r = 0$. The solutions arising are
certainly not admissible unless $0 < k \leq 1$ since we have excluded $s = 0$.
Clearly $1 - 2r + 3k^2/4 = 0$, and so

$r = \frac{3k^2 + 4}{8}, \quad s = \frac{4k}{3k^2 + 4}.$

These must satisfy $r \leq 1$ which is obvious, and $s \leq \tfrac{1}{2}$ which requires
$3k^2 + 4 - 8k \geq 0$ or $k \leq 2/3$. But then

$g = \frac{3k^2 + 4}{8} \left(1 - \frac{4k^2}{3k^2 + 4}\right) \left(1 - \frac{3k^2 + 4}{8} + \frac{k^2}{4}\right) = \frac{(4 - k^2)^2}{64} < \frac{1}{4}.$

Hence the maximum value of g arises from the boundary values. Then
$M = 4(2 - k)/27$ or $\tfrac{1}{4}$ according as $k \leq 5/16$ or $k \geq 5/16$.
This disposes of the problem.
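The stated value of M can be compared with a brute-force search. The following is a rough sketch, assuming the parametrization $g(r, s) = r(1 - ks)(1 - r + r^2 s^2)$ on $0 \leq r \leq 1$, $0 \leq s \leq \tfrac{1}{2}$; the grid sizes and the names `grid_max` and `predicted_max` are choices made here for illustration.

```python
def g(r, s, k):
    """The function g(r, s) of (21)."""
    return r * (1 - k * s) * (1 - r + r * r * s * s)


def grid_max(k, nr=600, ns=300):
    """Largest value of g on a uniform grid over 0 <= r <= 1, 0 <= s <= 1/2."""
    best = 0.0
    for i in range(nr + 1):
        r = i / nr
        for j in range(ns + 1):
            s = 0.5 * j / ns
            best = max(best, g(r, s, k))
    return best


def predicted_max(k):
    """M = 4(2 - k)/27 for k <= 5/16, and 1/4 for k >= 5/16."""
    return 4 * (2 - k) / 27 if k <= 5 / 16 else 0.25


for k in (0.0, 5 / 16, 0.5, 1.0):
    print(k, grid_max(k), predicted_max(k))
```

The grid is chosen so that the candidate points $r = \tfrac{1}{2}$, $r = \tfrac{2}{3}$, $s = 0$ and $s = \tfrac{1}{2}$ lie on it, so the two columns should agree closely for each k.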


Mount Allison University, Sackville, New Brunswick, Canada 
St John's College, Cambridge, England 










FINITE-DIFFERENCE METHODS 
for PARTIAL DIFFERENTIAL EQUATIONS 


By George E. Forsythe, Stanford University, and Wolfgang R. Wasow, 
University of Wisconsin. Covers both initial-value and boundary-value 
problems, and emphasizes the topics of greatest importance in the solution of 
these problems with high-speed computers. 1960. 444 pages. $11.50. 


STATISTICAL THEORY and METHODOLOGY 
in SCIENCE and ENGINEERING 


By K. Alexander Brownlee, The University of Chicago. Designed partly for 
students in the experimental sciences and partly for statistics majors, the 
main objective of this book is to give both groups facility and self-confidence 
in the actual use of statistical methods. 1960. 570 pages. $16.75.*


MODERN PROBABILITY THEORY and ITS APPLICATIONS 


By Emanuel Parzen, Stanford University. Introduces the reader to the basic 
concepts and techniques of probability theory without requiring advanced 
mathematical background. 1960. 464 pages. $10.75.* 


A PRIMER of REAL FUNCTIONS 


By Ralph P. Boas Jr., Northwestern University. This book offers an exposi- 
tion of the concepts and methods of “real variables” which can be readily 
understood by anyone who has had a course in calculus. The chief aim is 
the study of properties of various classes of functions and the relations 
between these properties. Carus Monograph #13. 1960. 189 pages. $4.00. 


STATISTICAL THEORY of COMMUNICATION 


By Y. W. Lee, Massachusetts Institute of Technology. The object of this 
book is to present clearly and rigorously a physically motivated and syste- 
matic account of the statistical theory of communication—an account that 
includes all of the basic elements. 1960. 509 pages. $16.75.* 


An INTRODUCTION to the THEORY of NUMBERS 


By Ivan Niven, University of Oregon, and Herbert S. Zuckerman, University 
of Washington. This approach to the theory of numbers is topical—as 
opposed to historical—stressing basic concepts at first, with specialized 
materials in the final three chapters. 1960. 250 pages. $6.25. 


MODERN TRIGONOMETRY 


By Dick Wick Hall, Harpur College, State University of New York, and 
Louis O. Kattsoff, Boston College. This book uses the analytical approach 
and emphasizes the ability to reason about the trigonometric functions. 
Rectangular and polar coordinates are introduced at the beginning and
used simultaneously throughout the text. 1961. Approx. 288 pages. Prob.
$4.95.


*Textbook edition also available for college adoption.


Reserve your examination copies today 


UNIVERSITY OF TORONTO PRESS Toronto, Ontario 








